Site-specific incorporation of redox active amino acids into proteins

ABSTRACT

Compositions and methods of producing components of protein biosynthetic machinery that include orthogonal tRNAs, orthogonal aminoacyl-tRNA synthetases, and orthogonal pairs of tRNAs/synthetases, which incorporate redox active amino acids into proteins are provided. Methods for identifying these orthogonal pairs are also provided along with methods of producing proteins with redox active amino acids using these orthogonal pairs.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application claims priority to and benefit of United States Provisional Patent Application U.S. Ser. No. 60/511,532, filed Oct. 14, 2003, the disclosure of which is incorporated herein by reference in its entirety for all purposes.

STATEMENT AS TO RIGHTS TO INVENTIONS MADE UNDER FEDERALLY SPONSORED RESEARCH AND DEVELOPMENT

This invention was made with government support under Grant No. GM 66494 from the National Institutes of Health and support under Grant DE-FG03-00ER45812 from the Department of Energy. The government may have certain rights to this invention.

FIELD OF THE INVENTION

The invention is in the field of translation biochemistry. The invention relates to compositions and methods for making and using orthogonal tRNAs, orthogonal aminoacyl-tRNA synthetases, and pairs thereof, that incorporate redox active amino acids into proteins. The invention also relates to methods of producing proteins in cells using such pairs and related compositions.

BACKGROUND OF THE INVENTION

Among the twenty common genetically encoded amino acids only cysteine undergoes facile redox chemistry, and as a result can participate in a wide variety of enzyme catalyzed oxidation and reduction reactions (Surdhar and Armstrong (1987) J. Phys. Chem., 91:6532-6537; Licht et al. (1996) Science 271:477-481). Consequently, most biological redox processes require cofactors such as flavins, nicotinamides and metal ions. In rare cases, quinones, derived from the post-translational modification of tyrosine and tryptophan side chains, are used as the redox cofactor (Stubbe and Van der Donk (1998) Chem. Rev., 98:705-762). For example, bovine plasma copper amine oxidase uses 3,4,6-trihydroxy-L-phenylalanine (TOPA) in the conversion of primary amines and molecular oxygen to aldehydes and hydrogen peroxide, respectively (Janes et al. (1990) Science 248:981-987). These amino acid derived redox catalysts are generated by both radical-mediated and enzymatic reactions (Rodgers and Dean (2000) Int. J. Biochem. Cell Biol., 32:945-955). Clearly, the ability to genetically encode additional redox active amino acids, rather than generate them by complex post-translational mechanisms, would significantly enhance the ability to both study and engineer electron transfer processes in proteins. This invention fulfills these and other needs, as will be apparent upon review of the following disclosure.

SUMMARY OF THE INVENTION

The invention provides compositions and methods of producing orthogonal components for incorporating redox active amino acids into a growing polypeptide chain in response to a selector codon, e.g., a stop codon, a nonsense codon, a four or more base codon, etc., e.g., in vivo. For example, the invention provides orthogonal-tRNAs (O-tRNAs), orthogonal aminoacyl-tRNA synthetases (O-RSs) and pairs thereof. These pairs can be used to incorporate redox active amino acids into growing polypeptide chains.

In some embodiments, a composition of the invention can include an orthogonal aminoacyl-tRNA synthetase (O-RS), where the O-RS preferentially aminoacylates an O-tRNA with a redox active amino acid. In certain embodiments, the O-RS comprises an amino acid sequence comprising SEQ ID NO.: 1, or a conservative variation thereof. In certain embodiments of the invention, the O-RS preferentially aminoacylates the O-tRNA with an efficiency of at least 50% of the efficiency of a polypeptide comprising an amino acid sequence of SEQ ID NO.: 1.

A composition that includes an O-RS can optionally further include an orthogonal tRNA (O-tRNA), where the O-tRNA recognizes a selector codon. Typically, an O-tRNA of the invention includes at least about, e.g., a 45%, a 50%, a 60%, a 75%, a 80%, or a 90% or more suppression efficiency in the presence of a cognate synthetase in response to a selector codon as compared to the O-tRNA comprising or encoded by a polynucleotide sequence as set forth in SEQ ID NO.: 2. In one embodiment, the suppression efficiency of the O-RS and the O-tRNA together is, e.g., 5 fold, 10 fold, 15 fold, 20 fold, 25 fold or more greater than the suppression efficiency of the O-tRNA lacking the O-RS. In one aspect, the suppression efficiency of the O-RS and the O-tRNA together is at least 45% of the suppression efficiency of an orthogonal tyrosyl-tRNA synthetase pair derived from Methanococcus jannaschii.

A composition that includes an O-tRNA can optionally include a cell (e.g., a non-eukaryotic cell, such as an E. coli cell and the like, or a eukaryotic cell), and/or a translation system.

A cell (e.g., a non-eukaryotic cell, or a eukaryotic cell) comprising a translation system is also provided by the invention, where the translation system includes an orthogonal-tRNA (O-tRNA); an orthogonal aminoacyl-tRNA synthetase (O-RS); and, a redox active amino acid. Typically, the O-RS preferentially aminoacylates the O-tRNA with an efficiency of at least 50% of the efficiency of a polypeptide comprising an amino acid sequence of SEQ ID NO.: 1. The O-tRNA recognizes the first selector codon, and the O-RS preferentially aminoacylates the O-tRNA with the redox active amino acid. In one embodiment, the O-tRNA comprises or is encoded by a polynucleotide sequence as set forth in SEQ ID NO.: 2, or a complementary polynucleotide sequence thereof. In one embodiment, the O-RS comprises an amino acid sequence as set forth in SEQ ID NO.: 1, or a conservative variation thereof.

A cell of the invention can optionally further comprise an additional different O-tRNA/O-RS pair and a second unnatural amino acid, e.g., where this O-tRNA recognizes a second selector codon and this O-RS preferentially aminoacylates the O-tRNA with the second unnatural amino acid. Optionally, a cell of the invention includes a nucleic acid that comprises a polynucleotide that encodes a polypeptide of interest, where the polynucleotide comprises a selector codon that is recognized by the O-tRNA.

In some embodiments, a cell of the invention includes an E. coli cell that includes an orthogonal-tRNA (O-tRNA), an orthogonal aminoacyl-tRNA synthetase (O-RS), a redox active amino acid, and a nucleic acid that comprises a polynucleotide that encodes a polypeptide of interest, where the polynucleotide comprises the selector codon that is recognized by the O-tRNA. In certain embodiments of the invention, the O-RS preferentially aminoacylates the O-tRNA with an efficiency of at least 50% of the efficiency of a polypeptide comprising an amino acid sequence of SEQ ID NO.: 1.

In certain embodiments of the invention, an O-tRNA of the invention comprises or is encoded by a polynucleotide sequence as set forth in SEQ ID NO.: 2, or a complementary polynucleotide sequence thereof. In certain embodiments of the invention, an O-RS comprises an amino acid sequence as set forth in SEQ ID NO.: 1, or a conservative variation thereof. In one embodiment, the O-RS or a portion thereof is encoded by a polynucleotide sequence encoding an amino acid as set forth in SEQ ID NO.: 1, or a complementary polynucleotide sequence thereof.

The O-tRNA and/or the O-RS of the invention can be derived from any of a variety of organisms (e.g., eukaryotic and/or non-eukaryotic organisms).

Polynucleotides are also a feature of the invention. A polynucleotide of the invention includes an artificial (e.g., man-made, and not naturally occurring) polynucleotide comprising a nucleotide sequence encoding an amino acid as set forth in SEQ ID NO.: 1, and/or is complementary to or that encodes a polynucleotide sequence of the above. A polynucleotide of the invention also includes a nucleic acid that hybridizes to a polynucleotide described above, under highly stringent conditions, over substantially the entire length of the nucleic acid. A polynucleotide of the invention also includes a polynucleotide that is, e.g., at least 75%, at least 80%, at least 90%, at least 95%, at least 98% or more identical to that of a naturally occurring tRNA (but a polynucleotide of the invention is other than a naturally occurring tRNA). Artificial polynucleotides that are, e.g., at least 80%, at least 90%, at least 95%, at least 98% or more identical to any of the above and/or a polynucleotide comprising a conservative variation of any of the above, are also included in polynucleotides of the invention.

Vectors comprising a polynucleotide of the invention are also a feature of the invention. For example, a vector of the invention can include a plasmid, a cosmid, a phage, a virus, an expression vector, and/or the like. A cell comprising a vector of the invention is also a feature of the invention.

Methods of producing components of an O-tRNA/O-RS pair are also features of the invention. Components produced by these methods are also a feature of the invention. For example, methods of producing at least one tRNA that is orthogonal to a cell (O-tRNA) include generating a library of mutant tRNAs; mutating an anticodon loop of each member of the library of mutant tRNAs to allow recognition of a selector codon, thereby providing a library of potential O-tRNAs; and subjecting to negative selection a first population of cells of a first species, where the cells comprise a member of the library of potential O-tRNAs. The negative selection eliminates cells that comprise a member of the library of potential O-tRNAs that is aminoacylated by an aminoacyl-tRNA synthetase (RS) that is endogenous to the cell. This provides a pool of tRNAs that are orthogonal to the cell of the first species, thereby providing at least one O-tRNA. An O-tRNA produced by the methods of the invention is also provided.

In certain embodiments, the methods further comprise subjecting to positive selection a second population of cells of the first species, where the cells comprise a member of the pool of tRNAs that are orthogonal to the cell of the first species, a cognate aminoacyl-tRNA synthetase, and a positive selection marker. Using the positive selection, cells are selected or screened for those cells that comprise a member of the pool of tRNAs that is aminoacylated by the cognate aminoacyl-tRNA synthetase and that shows a desired response in the presence of the positive selection marker, thereby providing an O-tRNA. In certain embodiments, the second population of cells comprises cells that were not eliminated by the negative selection.
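The two-stage selection described above can be pictured as a pair of filters applied in sequence. The following Python fragment is only an illustrative sketch: the record type `TRNACandidate`, its boolean fields, and the candidate names are hypothetical stand-ins for what is, in practice, observed through cell survival or reporter readout, not through known properties of each tRNA.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TRNACandidate:
    """Hypothetical record for one member of the library of potential O-tRNAs."""
    name: str
    charged_by_endogenous_rs: bool  # aminoacylated by a synthetase endogenous to the cell
    charged_by_cognate_rs: bool     # aminoacylated by the cognate synthetase


def negative_selection(library):
    """Eliminate candidates that are aminoacylated by an endogenous RS."""
    return [t for t in library if not t.charged_by_endogenous_rs]


def positive_selection(pool):
    """Keep candidates that are aminoacylated by the cognate RS."""
    return [t for t in pool if t.charged_by_cognate_rs]


# Made-up three-member library, for illustration only.
library = [
    TRNACandidate("tRNA-A", charged_by_endogenous_rs=True, charged_by_cognate_rs=True),
    TRNACandidate("tRNA-B", charged_by_endogenous_rs=False, charged_by_cognate_rs=False),
    TRNACandidate("tRNA-C", charged_by_endogenous_rs=False, charged_by_cognate_rs=True),
]

orthogonal_pool = negative_selection(library)   # pool of tRNAs orthogonal to the cell
o_trnas = positive_selection(orthogonal_pool)   # members that the cognate RS charges
print([t.name for t in o_trnas])
```

In this toy library only the candidate that escapes the endogenous synthetases and is still charged by the cognate synthetase survives both rounds.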

Methods for identifying an orthogonal-aminoacyl-tRNA synthetase for a redox active amino acid for use with an O-tRNA are also provided. For example, methods include subjecting to selection a population of cells of a first species, where the cells each comprise: 1) a member of a plurality of aminoacyl-tRNA synthetases (RSs), (e.g., the plurality of RSs can include mutant RSs, RSs derived from a species other than a first species or both mutant RSs and RSs derived from a species other than a first species); 2) the orthogonal-tRNA (O-tRNA) (e.g., from one or more species); and 3) a polynucleotide that encodes a positive selection marker and comprises at least one selector codon.

Cells (e.g., a host cell) are selected or screened for those that show an enhancement in suppression efficiency compared to cells lacking or having a reduced amount of the member of the plurality of RSs. These selected/screened cells comprise an active RS that aminoacylates the O-tRNA. An orthogonal aminoacyl-tRNA synthetase identified by the method is also a feature of the invention.

Methods of producing a protein in a cell (e.g., a non-eukaryotic cell, such as an E. coli cell or the like, or a eukaryotic cell) with a redox active amino acid at a specified position are also a feature of the invention. For example, a method includes growing, in an appropriate medium, a cell, where the cell comprises a nucleic acid that comprises at least one selector codon and encodes a protein, providing the redox active amino acid, and incorporating the redox active amino acid into the specified position in the protein during translation of the nucleic acid with the at least one selector codon, thereby producing the protein. The cell further comprises: an orthogonal-tRNA (O-tRNA) that functions in the cell and recognizes the selector codon; and, an orthogonal aminoacyl-tRNA synthetase (O-RS) that preferentially aminoacylates the O-tRNA with the redox active amino acid. A protein produced by this method is also a feature of the invention.

The invention also provides compositions that include proteins, where the proteins comprise a redox active amino acid (e.g., 3,4-dihydroxy-L-phenylalanine (DHP), a 3,4,5-trihydroxy-L-phenylalanine, a 3-nitro-tyrosine, a 4-nitro-phenylalanine, a 3-thiol-tyrosine, and/or the like). In certain embodiments, the protein comprises an amino acid sequence that is at least 75% identical to that of a therapeutic protein, a diagnostic protein, an industrial enzyme, or portion thereof. Optionally, the composition comprises a pharmaceutically acceptable carrier.

Definitions

Before describing the invention in detail, it is to be understood that this invention is not limited to particular biological systems, which can, of course, vary. It is also to be understood that the terminology used herein is for the purpose of describing particular embodiments only, and is not intended to be limiting. As used in this specification and the appended claims, the singular forms “a”, “an” and “the” include plural referents unless the content clearly dictates otherwise. Thus, for example, reference to “a cell” includes a combination of two or more cells; reference to “bacteria” includes mixtures of bacteria, and the like.

Unless defined herein and below in the remainder of the specification, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which the invention pertains.

Orthogonal: As used herein, the term “orthogonal” refers to a molecule (e.g., an orthogonal tRNA (O-tRNA) and/or an orthogonal aminoacyl tRNA synthetase (O-RS)) that functions with endogenous components of a cell with reduced efficiency as compared to a corresponding molecule that is endogenous to the cell or translation system, or that fails to function with endogenous components of the cell. In the context of tRNAs and aminoacyl-tRNA synthetases, orthogonal refers to an inability or reduced efficiency, e.g., less than 20% efficiency, less than 10% efficiency, less than 5% efficiency, or less than 1% efficiency, of an orthogonal tRNA to function with an endogenous tRNA synthetase compared to an endogenous tRNA to function with the endogenous tRNA synthetase, or of an orthogonal aminoacyl-tRNA synthetase to function with an endogenous tRNA compared to an endogenous tRNA synthetase to function with the endogenous tRNA. The orthogonal molecule lacks a functionally normal endogenous complementary molecule in the cell. For example, an orthogonal tRNA in a cell is aminoacylated by any endogenous RS of the cell with reduced or even zero efficiency, when compared to aminoacylation of an endogenous tRNA by the endogenous RS. In another example, an orthogonal RS aminoacylates any endogenous tRNA in a cell of interest with reduced or even zero efficiency, as compared to aminoacylation of the endogenous tRNA by an endogenous RS. A second orthogonal molecule can be introduced into the cell that functions with the first orthogonal molecule. For example, an orthogonal tRNA/RS pair includes introduced complementary components that function together in the cell with an efficiency (e.g., 45% efficiency, 50% efficiency, 60% efficiency, 70% efficiency, 75% efficiency, 80% efficiency, 90% efficiency, 95% efficiency, or 99% or more efficiency) as compared to that of a control, e.g., a corresponding tRNA/RS endogenous pair, or an active orthogonal pair (e.g., a tyrosyl orthogonal tRNA/RS pair).
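The relative-efficiency language in this definition can be made concrete with a short calculation. The Python sketch below is a hedged illustration only: the function names, the idea of supplying two measured aminoacylation rates in the same arbitrary units, and the default cutoff are assumptions chosen for the example, with the 20% figure taken from the least stringent threshold recited above.

```python
def relative_efficiency(test_rate: float, endogenous_rate: float) -> float:
    """Activity of the test pairing as a percentage of the endogenous pairing.

    Both rates are assumed to be measured in the same (arbitrary) units.
    """
    if endogenous_rate <= 0:
        raise ValueError("endogenous_rate must be positive")
    return 100.0 * test_rate / endogenous_rate


def is_orthogonal(test_rate: float, endogenous_rate: float,
                  threshold_percent: float = 20.0) -> bool:
    """True if the test pairing works at less than threshold_percent of the control.

    The 20% default is one of the example cutoffs in the definition
    (others recited are 10%, 5% and 1%).
    """
    return relative_efficiency(test_rate, endogenous_rate) < threshold_percent


# Made-up rates: a candidate tRNA charged by a host synthetase at 0.3 units,
# versus 10.0 units for the host tRNA with the same synthetase.
print(is_orthogonal(0.3, 10.0))                         # below the 20% cutoff
print(is_orthogonal(0.3, 10.0, threshold_percent=1.0))  # not below the 1% cutoff
```

Tightening `threshold_percent` corresponds to the progressively stricter cutoffs listed in the definition.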

Orthogonal tyrosyl-tRNA: As used herein, an orthogonal tyrosyl-tRNA (tyrosyl-O-tRNA) is a tRNA that is orthogonal to a translation system of interest, where the tRNA is: (1) identical or substantially similar to a naturally occurring tyrosyl-tRNA, (2) derived from a naturally occurring tyrosyl-tRNA by natural or artificial mutagenesis, (3) derived by any process that takes a sequence of a wild-type or mutant tyrosyl-tRNA sequence of (1) or (2) into account, (4) homologous to a wild-type or mutant tyrosyl-tRNA, (5) homologous to any example tRNA that is designated as a substrate for a tyrosyl-tRNA synthetase in TABLE 1, or (6) a conservative variant of any example tRNA that is designated as a substrate for a tyrosyl-tRNA synthetase in TABLE 1. The tyrosyl-tRNA can exist charged with an amino acid, or in an uncharged state. It is also to be understood that a “tyrosyl-O-tRNA” optionally is charged (aminoacylated) by a cognate synthetase with an amino acid other than tyrosine, e.g., with a redox active amino acid. Indeed, it will be appreciated that a tyrosyl-O-tRNA of the invention is advantageously used to insert essentially any amino acid, whether natural or artificial, into a growing polypeptide, during translation, in response to a selector codon.

Orthogonal tyrosyl amino acid synthetase: As used herein, an orthogonal tyrosyl amino acid synthetase (tyrosyl-O-RS) is an enzyme that preferentially aminoacylates the tyrosyl-O-tRNA with an amino acid in a translation system of interest. The amino acid that the tyrosyl-O-RS loads onto the tyrosyl-O-tRNA can be any amino acid, whether natural or artificial, and is not limited herein. The synthetase is optionally the same as or homologous to a naturally occurring tyrosyl amino acid synthetase, or the same as or homologous to a synthetase designated as a tyrosyl-O-RS in TABLE 1. For example, the tyrosyl-O-RS can be a conservative variant of a tyrosyl-O-RS of TABLE 1, and/or can be at least 50%, 60%, 70%, 80%, 90%, 95%, 98%, 99% or more identical in sequence to a tyrosyl-O-RS of TABLE 1.

Cognate: The term “cognate” refers to components that function together, e.g., an orthogonal tRNA and an orthogonal aminoacyl-tRNA synthetase. The components can also be referred to as being complementary.

Preferentially aminoacylates: The term “preferentially aminoacylates” refers to an efficiency, e.g., 70% efficiency, 75% efficiency, 85% efficiency, 90% efficiency, 95% efficiency, or 99% or more efficiency, at which an O-RS aminoacylates an O-tRNA with a redox active amino acid as compared to the O-RS aminoacylating a naturally occurring tRNA or a starting material used to generate the O-tRNA.

Selector codon: The term “selector codon” refers to codons recognized by the O-tRNA in the translation process and not recognized by an endogenous tRNA. The O-tRNA anticodon loop recognizes the selector codon on the mRNA and incorporates its amino acid, e.g., an unnatural amino acid, such as a redox active amino acid, at this site in the polypeptide. Selector codons can include, e.g., nonsense codons, such as stop codons, e.g., amber, ochre, and opal codons; four or more base codons; rare codons; codons derived from natural or unnatural base pairs and/or the like.

Suppressor tRNA: A suppressor tRNA is a tRNA that alters the reading of a messenger RNA (mRNA) in a given translation system, e.g., by providing a mechanism for incorporating an amino acid into a polypeptide chain in response to a selector codon. For example, a suppressor tRNA can read through, e.g., a stop codon, a four base codon, a rare codon, etc.
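The effect of a suppressor tRNA on an amber selector codon can be illustrated with a toy translation routine. This Python sketch is a simplification under stated assumptions: it uses the standard genetic code, treats the coding strand as DNA (TAG for the amber codon), marks the incorporated unnatural amino acid with the placeholder letter "X", and translates a short made-up open reading frame rather than any sequence from this disclosure. The function `translate` and its parameters are hypothetical names.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(dna: str, selector_codon: str = "TAG",
              suppress: bool = False, unnatural: str = "X") -> str:
    """Translate a coding sequence, optionally reading through the selector codon.

    With suppress=False the selector codon acts as an ordinary stop codon.
    With suppress=True it is decoded as `unnatural`, modelling an O-tRNA
    charged by its O-RS; other stop codons still terminate translation.
    """
    protein = []
    for i in range(0, len(dna) - 2, 3):
        codon = dna[i:i + 3]
        if suppress and codon == selector_codon:
            protein.append(unnatural)
            continue
        residue = CODON_TABLE[codon]
        if residue == "*":  # unsuppressed stop codon terminates the chain
            break
        protein.append(residue)
    return "".join(protein)


# Made-up open reading frame with an amber codon at the fourth position.
orf = "ATGGTTCTGTAGGAATAA"
print(translate(orf))                 # truncated at the amber codon
print(translate(orf, suppress=True))  # read through, "X" at position 4
```

Without the orthogonal pair the chain terminates at the amber codon; with it, translation continues to the next stop codon and the placeholder marks the site of incorporation.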

Suppression activity: As used herein, the term “suppression activity” refers, in general, to the ability of a tRNA (e.g., a suppressor tRNA) to allow translational read-through of a codon (e.g., a selector codon that is an amber codon or a 4-or-more base codon) that would otherwise result in the termination of translation or mistranslation (e.g., frame-shifting). Suppression activity of a suppressor tRNA can be expressed as a percentage of translational read-through activity observed compared to a second suppressor tRNA, or as compared to a control system, e.g., a control system lacking an O-RS.

The present invention provides various means by which suppression activity can be quantitated. Percent suppression of a particular O-tRNA and O-RS against a selector codon (e.g., an amber codon) of interest refers to the percentage of activity of a given expressed test marker (e.g., LacZ), that includes a selector codon, in a nucleic acid encoding the expressed test marker, in a translation system of interest, where the translation system of interest includes an O-RS and an O-tRNA, as compared to a positive control construct, where the positive control lacks the O-tRNA, the O-RS and the selector codon. Thus, for example, if an active positive control marker construct that lacks a selector codon has an observed activity of X in a given translation system, in units relevant to the marker assay at issue, then percent suppression of a test construct comprising the selector codon is the percentage of X that the test marker construct displays under essentially the same environmental conditions as the positive control marker was expressed under, except that the test marker construct is expressed in a translation system that also includes the O-tRNA and the O-RS. Typically, the translation system expressing the test marker also includes an amino acid that is recognized by the O-RS and O-tRNA. Optionally, the percent suppression measurement can be refined by comparison of the test marker to a “background” or “negative” control marker construct, which includes the same selector codon as the test marker, but in a system that does not include the O-tRNA, O-RS and/or relevant amino acid recognized by the O-tRNA and/or O-RS. This negative control is useful in normalizing percent suppression measurements to account for background signal effects from the marker in the translation system of interest.

Suppression efficiency can be determined by any of a number of assays known in the art. For example, a β-galactosidase reporter assay can be used, e.g., a derivatized lacZ plasmid (where the construct has a selector codon in the lacZ nucleic acid sequence) is introduced into cells from an appropriate organism (e.g., an organism where the orthogonal components can be used) along with a plasmid comprising an O-tRNA of the invention. A cognate synthetase can also be introduced (either as a polypeptide or a polynucleotide that encodes the cognate synthetase when expressed). The cells are grown in media to a desired density, e.g., to an OD₆₀₀ of about 0.5, and β-galactosidase assays are performed, e.g., using the Betafluor™ β-Galactosidase Assay Kit (Novagen). Percent suppression can be calculated as the percentage of activity for a sample relative to a comparable control, e.g., the value observed from the derivatized lacZ construct, where the construct has a corresponding sense codon at the desired position rather than a selector codon.
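The percent suppression calculation described above, together with the fold comparison of an O-tRNA with and without its O-RS, reduces to simple ratios. The Python sketch below is one possible way to write them down, not a prescribed formula: the function names are hypothetical, the activity values are made-up numbers in arbitrary reporter units, and subtracting the negative-control reading from the test reading is only one assumed way of applying the background normalization mentioned above.

```python
def percent_suppression(test_activity: float,
                        positive_control_activity: float,
                        background_activity: float = 0.0) -> float:
    """Reporter activity of the selector-codon construct as a percentage of the
    positive control (same marker with a sense codon at that position).

    background_activity is the optional negative-control reading (selector-codon
    construct without O-tRNA, O-RS and/or the amino acid). Subtracting it from
    the test reading is an assumption made for this sketch.
    """
    if positive_control_activity <= 0:
        raise ValueError("positive_control_activity must be positive")
    corrected = test_activity - background_activity
    return 100.0 * corrected / positive_control_activity


def fold_enhancement(activity_with_rs: float, activity_without_rs: float) -> float:
    """Ratio of suppression with the O-RS present to suppression without it."""
    if activity_without_rs <= 0:
        raise ValueError("activity_without_rs must be positive")
    return activity_with_rs / activity_without_rs


# Made-up readings in arbitrary units.
positive = 1000.0   # sense-codon control
test = 250.0        # selector codon, O-tRNA + O-RS + amino acid present
background = 50.0   # selector codon, O-RS absent

print(percent_suppression(test, positive))              # uncorrected
print(percent_suppression(test, positive, background))  # background-corrected
print(fold_enhancement(test, background))               # with vs. without O-RS
```

Reporting both the uncorrected and the background-corrected value makes it clear how much of the signal is attributable to the orthogonal pair rather than to read-through in the host alone.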

Translation system: The term “translation system” refers to the components that incorporate an amino acid into a growing polypeptide chain (protein). Components of a translation system can include, e.g., ribosomes, tRNAs, synthetases, mRNA and the like. The O-tRNA and/or the O-RSs of the invention can be added to or be part of an in vitro or in vivo translation system, e.g., in a non-eukaryotic cell, e.g., a bacterium (such as E. coli), or in a eukaryotic cell, e.g., a yeast cell, a mammalian cell, a plant cell, an algae cell, a fungus cell, an insect cell, and/or the like.

Unnatural amino acid: As used herein, the term “unnatural amino acid” refers to any amino acid, modified amino acid, and/or amino acid analogue, such as a redox active amino acid, that is not one of the 20 common naturally occurring amino acids or selenocysteine or pyrrolysine.

Derived from: As used herein, the term “derived from” refers to a component that is isolated from or made using a specified molecule or organism, or information from the specified molecule or organism.

Positive selection or screening marker: As used herein, the term “positive selection or screening marker” refers to a marker that, when present, e.g., expressed, activated or the like, results in identification of a cell that comprises the trait, e.g., a cell with the positive selection marker, from those without the trait.

Negative selection or screening marker: As used herein, the term “negative selection or screening marker” refers to a marker that, when present, e.g., expressed, activated, or the like, allows identification of a cell that does not comprise a selected property or trait (e.g., as compared to a cell that does possess the property or trait).

Reporter: As used herein, the term “reporter” refers to a component that can be used to identify and/or select target components of a system of interest. For example, a reporter can include a protein, e.g., an enzyme, that confers antibiotic resistance or sensitivity (e.g., β-lactamase, chloramphenicol acetyltransferase (CAT), and the like), a fluorescent screening marker (e.g., green fluorescent protein (GFP), YFP, EGFP, RFP, etc.), a luminescent marker (e.g., a firefly luciferase protein), an affinity based screening marker, or positive or negative selectable marker genes such as lacZ, β-gal/lacZ (β-galactosidase), Adh (alcohol dehydrogenase), his3, ura3, leu2, lys2, or the like.

Eukaryote: As used herein, the term “eukaryote” refers to organisms belonging to the phylogenetic domain Eucarya such as animals (e.g., mammals, insects, reptiles, birds, etc.), ciliates, plants (e.g., monocots, dicots, algae, etc.), fungi, yeasts, flagellates, microsporidia, protists, etc.

Non-eukaryote: As used herein, the term “non-eukaryote” refers to non-eukaryotic organisms. For example, a non-eukaryotic organism can belong to the Eubacteria (e.g., Escherichia coli, Thermus thermophilus, Bacillus stearothermophilus, etc.) phylogenetic domain, or the Archaea (e.g., Methanococcus jannaschii (Mj), Methanosarcina mazei (Mm), Methanobacterium thermoautotrophicum (Mt), Methanococcus maripaludis, Methanopyrus kandleri, Halobacterium such as Haloferax volcanii and Halobacterium species NRC-1, Archaeoglobus fulgidus (Af), Pyrococcus furiosus (Pf), Pyrococcus horikoshii (Ph), Pyrobaculum aerophilum, Pyrococcus abyssi, Sulfolobus solfataricus (Ss), Sulfolobus tokodaii, Aeropyrum pernix (Ap), Thermoplasma acidophilum, Thermoplasma volcanium, etc.) phylogenetic domain.

Conservative variant: As used herein, the term “conservative variant,” in the context of a translation component, refers to a translation component, e.g., a conservative variant O-tRNA or a conservative variant O-RS, that performs functionally similarly to the base component that it resembles, e.g., an O-tRNA or O-RS, while having variations in the sequence as compared to a reference O-tRNA or O-RS. For example, an O-RS will aminoacylate a complementary O-tRNA or a conservative variant O-tRNA with an unnatural amino acid, e.g., a redox active amino acid such as 3,4-dihydroxy-L-phenylalanine (DHP), although the O-tRNA and the conservative variant O-tRNA do not have the same sequence. The conservative variant can have, e.g., one variation, two variations, three variations, four variations, or five or more variations in sequence, as long as the conservative variant is complementary to the corresponding O-tRNA or O-RS.

Selection or screening agent: As used herein, the term “selection or screening agent” refers to an agent that, when present, allows for selection/screening of certain components from a population. For example, a selection or screening agent can be, but is not limited to, e.g., a nutrient, an antibiotic, a wavelength of light, an antibody, an expressed polynucleotide, or the like. The selection agent can be varied, e.g., by concentration, intensity, etc.

In response to: As used herein, the term “in response to” refers to the process in which a tRNA of the invention recognizes a selector codon and mediates the incorporation of the redox active amino acid, which is bound to the tRNA, into the growing polypeptide chain.

Encode: As used herein, the term “encode” refers to any process whereby the information in a polymeric macromolecule or sequence string is used to direct the production of a second molecule or sequence string that is different from the first molecule or sequence string. As used herein, the term is used broadly, and can have a variety of applications. In one aspect, the term “encode” describes the process of semi-conservative DNA replication, where one strand of a double-stranded DNA molecule is used as a template to encode a newly synthesized complementary sister strand by a DNA-dependent DNA polymerase.

In another aspect, the term “encode” refers to any process whereby the information in one molecule is used to direct the production of a second molecule that has a different chemical nature from the first molecule. For example, a DNA molecule can encode an RNA molecule (e.g., by the process of transcription incorporating a DNA-dependent RNA polymerase enzyme). Also, an RNA molecule can encode a polypeptide, as in the process of translation. When used to describe the process of translation, the term “encode” also extends to the triplet codon that encodes an amino acid. In some aspects, an RNA molecule can encode a DNA molecule, e.g., by the process of reverse transcription incorporating an RNA-dependent DNA polymerase. In another aspect, a DNA molecule can encode a polypeptide, where it is understood that “encode” as used in that case incorporates both the processes of transcription and translation.

BRIEF DESCRIPTION OF THE FIGURES

FIG. 1 provides a schematic illustration of the oxidation of 3,4-dihydroxy-L-phenylalanine (DHP; structure 1) to DHP-semiquinone radical 2, which is readily oxidized to DHP-quinone 3.

FIGS. 2A and 2B provide illustrations of DHP dependent expression of sperm whale myoglobin as a response to an amber codon at position 4 in the Mb gene. FIG. 2A provides a silver stained gel and western blot. FIG. 2B provides an ESI-QqTOF mass spectrum analysis of DHPMb.

FIGS. 3A and 3B provide cyclic voltammograms. FIG. 3A provides cyclic voltammograms of the heme group in wtMb and DHPMb. FIG. 3B provides cyclic voltammograms of DHP for different solutions containing: 100 μM DHP, the wtMb or DHPMb. All voltammograms were recorded in 0.1 M phosphate buffer, pH 7.4, under argon; scan rate: 1 V s⁻¹ vs. SCE.

FIG. 4 provides an illustration of the oxidation of DHP electrochemically within a protein.

FIG. 5 provides the nucleotide and amino acid sequences of Methanococcus jannaschii tyrosyl-tRNA synthetase (MjTyrRS).

FIG. 6 provides the nucleotide and amino acid sequences of 3,4-dihydroxy-L-phenylalanine (DHP)-tRNA synthetase (DHPRS), based on Methanococcus jannaschii tyrosyl-tRNA synthetase (MjTyrRS) having the amino acid changes: Tyr32→Leu, Ala67→Ser, His70→Asn, and Ala167→Gln. The changed amino acids and corresponding triplet codons (relative to the wild-type sequence) are boxed.

DETAILED DESCRIPTION

In order to add additional synthetic amino acids, such as a redox active amino acid, to the genetic code, in vivo, new orthogonal pairs of an aminoacyl-tRNA synthetase and a tRNA are needed that can function efficiently in the translational machinery, but that are “orthogonal” to the translation system at issue, meaning that they function independently of the synthetases and tRNAs endogenous to the translation system. Desired characteristics of the orthogonal pair include a tRNA that decodes or recognizes only a specific new codon, e.g., a selector codon, that is not decoded by any endogenous tRNA, and an aminoacyl-tRNA synthetase that preferentially aminoacylates (or charges) its cognate tRNA with only a specific redox active amino acid. The O-tRNA is also not typically aminoacylated by endogenous synthetases. For example, in E. coli, an orthogonal pair will include an aminoacyl-tRNA synthetase that does not cross-react with any of the endogenous tRNAs, e.g., of which there are 40 in E. coli, and an orthogonal tRNA that is not aminoacylated by any of the endogenous synthetases, e.g., of which there are 21 in E. coli.

This invention provides compositions of and methods for identifying and producing additional orthogonal tRNA-aminoacyl-tRNA synthetase pairs, e.g., O-tRNA/O-RS pairs that can be used to incorporate a redox active amino acid. An O-tRNA of the invention is capable of mediating incorporation of a redox active amino acid into a protein that is encoded by a polynucleotide, which comprises a selector codon that is recognized by the O-tRNA, e.g., in vivo. The anticodon loop of the O-tRNA recognizes the selector codon on an mRNA and incorporates its amino acid, e.g., a redox active amino acid, at this site in the polypeptide. An orthogonal aminoacyl-tRNA synthetase of the invention preferentially aminoacylates (or charges) its O-tRNA with only a specific redox active amino acid.

For example, the redox active amino acid 3,4-dihydroxy-L-phenylalanine (DHP), which can undergo two-electron oxidation to a quinone, has been incorporated selectively and efficiently into proteins in an organism, e.g., Escherichia coli (E. coli), in response to a selector codon, e.g., a TAG codon. See FIG. 1. DHP can be oxidized electrochemically within the protein. The ability to incorporate a redox active amino acid site-specifically into proteins can facilitate the study of electron transfer in proteins, as well as enable the engineering of a redox protein with novel properties. See FIG. 4. For example, expression of redox active proteins can facilitate the study and the ability to alter electron transfer pathways in proteins, alter catalytic function of enzymes, crosslink proteins with small molecules and biomolecules, etc.

Orthogonal tRNA/Orthogonal Aminoacyl-tRNA Synthetases and Pairs Thereof

Translation systems that are suitable for making proteins that include one or more unnatural amino acids, e.g., redox active amino acids, are described in International Publication Numbers WO 2002/086075, entitled “METHODS AND COMPOSITION FOR THE PRODUCTION OF ORTHOGONAL tRNA-AMINOACYLtRNA SYNTHETASE PAIRS” and WO 2002/085923, entitled “IN VIVO INCORPORATION OF UNNATURAL AMINO ACIDS.” In addition, see International Application Number PCT/US2004/011786, filed Apr. 16, 2004. Each of these applications is incorporated herein by reference in its entirety. Such translation systems generally comprise cells (e.g., non-eukaryotic cells, or eukaryotic cells) that include an orthogonal tRNA (O-tRNA), an orthogonal aminoacyl tRNA synthetase (O-RS), and a redox active amino acid, where the O-RS aminoacylates the O-tRNA with the redox active amino acid. An orthogonal pair of the invention includes an O-tRNA, e.g., a suppressor tRNA, a frameshift tRNA, or the like, and an O-RS. Individual components are also provided in the invention.

The O-tRNA recognizes a selector codon and includes at least about, e.g., a 45%, a 50%, a 60%, a 75%, an 80%, or a 90% or more suppression efficiency in the presence of a cognate synthetase in response to a selector codon as compared to the O-tRNA comprising or encoded by a polynucleotide sequence as set forth in SEQ ID NO.: 2. The O-RS aminoacylates the O-tRNA with the redox active amino acid. The cell uses the components to incorporate the redox active amino acid into a growing polypeptide chain, e.g., via a nucleic acid that comprises a polynucleotide that encodes a polypeptide of interest, where the polynucleotide comprises a selector codon that is recognized by the O-tRNA. In certain embodiments of the invention, a cell such as an E. coli cell includes an orthogonal tRNA (O-tRNA), an orthogonal aminoacyl-tRNA synthetase (O-RS), a redox active amino acid, and a nucleic acid that comprises a polynucleotide that encodes a polypeptide of interest, where the polynucleotide comprises the selector codon that is recognized by the O-tRNA. The translation system can also be an in vitro system.

In one embodiment, the suppression efficiency of the O-RS and the O-tRNA together is about, e.g., 5 fold, 10 fold, 15 fold, 20 fold, or 25 fold or more greater than the suppression efficiency of the O-tRNA lacking the O-RS. In one aspect, the suppression efficiency of the O-RS and the O-tRNA together is at least about, e.g., 35%, 40%, 45%, 50%, 60%, 75%, 80%, or 90% or more of the suppression efficiency of an orthogonal tyrosyl-tRNA synthetase pair derived from Methanococcus jannaschii.

The invention optionally includes multiple O-tRNA/O-RS pairs in a cell, which allows incorporation of more than one unnatural amino acid, e.g., a redox active amino acid and another unnatural amino acid. For example, the cell can further include an additional different O-tRNA/O-RS pair and a second unnatural amino acid, where this additional O-tRNA recognizes a second selector codon and this additional O-RS preferentially aminoacylates the O-tRNA with the second unnatural amino acid. For example, a cell, which includes an O-tRNA/O-RS pair (where the O-tRNA recognizes, e.g., an amber selector codon), can further comprise a second orthogonal pair, e.g., leucyl, lysyl, glutamyl, etc., (where the second O-tRNA recognizes a different selector codon, e.g., an opal, four-base, or the like).

The O-tRNA and/or the O-RS can be naturally occurring or can be derived by mutation of a naturally occurring tRNA and/or RS, e.g., which generates libraries of tRNAs and/or libraries of RSs, from a variety of organisms. For example, one strategy for producing an orthogonal tRNA/aminoacyl-tRNA synthetase pair involves importing a heterologous (to the host cell) tRNA/synthetase pair from, e.g., a source other than the host cell, or multiple sources, into the host cell. The properties of the heterologous synthetase candidate include, e.g., that it does not charge any host cell tRNA, and the properties of the heterologous tRNA candidate include, e.g., that it is not aminoacylated by any host cell synthetase. In addition, the heterologous tRNA is orthogonal to all host cell synthetases.

A second strategy for generating an orthogonal pair involves generating mutant libraries from which to screen and/or select an O-tRNA or O-RS. These strategies can also be combined.

Orthogonal tRNA (O-tRNA)

An orthogonal tRNA (O-tRNA) mediates incorporation of a redox active amino acid into a protein that is encoded by a polynucleotide that comprises a selector codon that is recognized by the O-tRNA, e.g., in vivo or in vitro. In certain embodiments, an O-tRNA of the invention includes at least about, e.g., a 45%, a 50%, a 60%, a 75%, an 80%, or a 90% or more suppression efficiency in the presence of a cognate synthetase in response to a selector codon as compared to the O-tRNA comprising or encoded by a polynucleotide sequence as set forth in SEQ ID NO.: 2.

Suppression efficiency can be determined by any of a number of assays known in the art. For example, a β-galactosidase reporter assay can be used, e.g., a derivatized lacZ plasmid (where the construct has a selector codon in the lacZ nucleic acid sequence) is introduced into cells from an appropriate organism (e.g., an organism where the orthogonal components can be used) along with a plasmid comprising an O-tRNA of the invention. A cognate synthetase can also be introduced (either as a polypeptide or a polynucleotide that encodes the cognate synthetase when expressed). The cells are grown in media to a desired density, e.g., to an OD₆₀₀ of about 0.5, and β-galactosidase assays are performed, e.g., using the BetaFluor™ β-Galactosidase Assay Kit (Novagen). Percent suppression can be calculated as the percentage of activity for a sample relative to a comparable control, e.g., the value observed from the derivatized lacZ construct, where the construct has a corresponding sense codon at the desired position rather than a selector codon.
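The percent suppression calculation described above can be sketched as follows. This is a minimal illustration, not an assay protocol: the function name, the argument names, and the optional background subtraction are assumptions introduced here, and the activity values are hypothetical readings in arbitrary units.

```python
def percent_suppression(sample_activity, control_activity, background=0.0):
    """Return reporter activity of a sample as a percentage of a control.

    sample_activity:  activity from the lacZ construct carrying a selector codon
    control_activity: activity from the comparable construct carrying a sense
                      codon at the same position
    background:       optional blank reading subtracted from both values
                      (an assumption added for illustration; defaults to 0)
    """
    corrected_control = control_activity - background
    if corrected_control <= 0:
        raise ValueError("control activity must exceed background")
    return 100.0 * (sample_activity - background) / corrected_control


# Hypothetical readings: selector-codon construct vs. sense-codon control.
efficiency = percent_suppression(sample_activity=45.0, control_activity=100.0)
print(f"{efficiency:.1f}% suppression")
```

Under this reading, an O-tRNA meeting the "at least about 45%" threshold would give a value of 45 or more when the reference O-tRNA is used as the control.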

An example of O-tRNAs of the invention is SEQ ID NO.: 2. See Table 1 and Example 2, herein, for sequences of exemplary O-tRNA and O-RS molecules. See also the section entitled “Nucleic Acid and Polypeptide Sequence and Variants” herein. In the tRNA molecule, Thymine (T) is replaced with Uracil (U). Additional modifications to the bases can also be present. The invention also includes conservative variations of O-tRNA. For example, conservative variations of O-tRNA include those molecules that function like the O-tRNA of SEQ ID NO.: 2 and maintain the tRNA L-shaped structure, but do not have the same sequence (and are other than wild type tRNA molecules). See also the section herein entitled “Nucleic acids and Polypeptides Sequence and Variants.”

The composition comprising an O-tRNA can further include an orthogonal aminoacyl-tRNA synthetase (O-RS), where the O-RS preferentially aminoacylates the O-tRNA with a redox active amino acid. In certain embodiments, a composition including an O-tRNA can further include a translation system (e.g., in vitro or in vivo). A nucleic acid that comprises a polynucleotide that encodes a polypeptide of interest, where the polynucleotide comprises a selector codon that is recognized by the O-tRNA, or a combination of one or more of these can also be present in the cell. See also the section herein entitled “Orthogonal aminoacyl-tRNA synthetases.”

Methods of producing an orthogonal tRNA (O-tRNA) are also a feature of the invention. An O-tRNA produced by the method is also a feature of the invention. In certain embodiments of the invention, the O-tRNAs can be produced by generating a library of mutants. The library of mutant tRNAs can be generated using various mutagenesis techniques known in the art. For example, the mutant tRNAs can be generated by site-specific mutations, random point mutations, homologous recombination, DNA shuffling or other recursive mutagenesis methods, chimeric construction or any combination thereof.

Additional mutations can be introduced at a specific position(s), e.g., at a nonconservative position(s), or at a conservative position, at a randomized position(s), or a combination of both in a desired loop or region of a tRNA, e.g., an anticodon loop, the acceptor stem, D arm or loop, variable loop, TΨC arm or loop, other regions of the tRNA molecule, or a combination thereof. Typically, mutations in a tRNA include mutating the anticodon loop of each member of the library of mutant tRNAs to allow recognition of a selector codon. The method can further include adding an additional sequence (CCA) to a terminus of the O-tRNA. Typically, an O-tRNA possesses an improvement of orthogonality for a desired organism compared to the starting material, e.g., the plurality of tRNA sequences, while preserving its affinity towards a desired RS.

The methods optionally include analyzing the homology of sequences of tRNAs and/or aminoacyl-tRNA synthetases to determine potential candidates for an O-tRNA, O-RS and/or pairs thereof, that appear to be orthogonal for a specific organism. Computer programs known in the art and described herein can be used for the analysis, e.g., BLAST and pileup programs can be used. In one example, to choose potential orthogonal translational components for use in E. coli, a prokaryotic organism, a synthetase and/or a tRNA is chosen that does not display unusual homology to prokaryotic organisms.

Typically, an O-tRNA is obtained by subjecting to, e.g., negative selection, a population of cells of a first species, where the cells comprise a member of the plurality of potential O-tRNAs. The negative selection eliminates cells that comprise a member of the library of potential O-tRNAs that is aminoacylated by an aminoacyl-tRNA synthetase (RS) that is endogenous to the cell. This provides a pool of tRNAs that are orthogonal to the cell of the first species.

In certain embodiments, in the negative selection, a selector codon(s) is introduced into a polynucleotide that encodes a negative selection marker, e.g., an enzyme that confers antibiotic resistance, e.g., β-lactamase, an enzyme that confers a detectable product, e.g., β-galactosidase, chloramphenicol acetyltransferase (CAT), e.g., a toxic product, such as barnase, at a nonessential position (e.g., still producing a functional barnase), etc. Screening/selection is optionally done by growing the population of cells in the presence of a selective agent (e.g., an antibiotic, such as ampicillin). In one embodiment, the concentration of the selection agent is varied.

For example, to measure the activity of suppressor tRNAs, a selection system is used that is based on the in vivo suppression of a selector codon, e.g., nonsense or frameshift mutations introduced into a polynucleotide that encodes a negative selection marker, e.g., a gene for β-lactamase (bla). For example, polynucleotide variants, e.g., bla variants, with a selector codon at a certain position, are constructed. Cells, e.g., bacteria, are transformed with these polynucleotides. In the case of an orthogonal tRNA, which cannot be efficiently charged by endogenous E. coli synthetases, antibiotic resistance, e.g., ampicillin resistance, should be about or less than that for a bacterium transformed with no plasmid. If the tRNA is not orthogonal, or if a heterologous synthetase capable of charging the tRNA is co-expressed in the system, a higher level of antibiotic, e.g., ampicillin, resistance is observed. Cells, e.g., bacteria, are chosen that are unable to grow on LB agar plates with antibiotic concentrations about equal to cells transformed with no plasmids.

In the case of a toxic product (e.g., ribonuclease or barnase), when a member of the plurality of potential tRNAs is aminoacylated by endogenous host, e.g., Escherichia coli synthetases (i.e., it is not orthogonal to the host, e.g., Escherichia coli synthetases), the selector codon is suppressed and the toxic polynucleotide product produced leads to cell death. Cells harboring orthogonal tRNAs or non-functional tRNAs survive.

In one embodiment, the pool of tRNAs that are orthogonal to a desired organism are then subjected to a positive selection in which a selector codon is placed in a positive selection marker, e.g., encoded by a drug resistance gene, such as a β-lactamase gene. The positive selection is performed on a cell comprising a polynucleotide encoding or comprising a member of the pool of tRNAs that are orthogonal to the cell, a polynucleotide encoding a positive selection marker, and a polynucleotide encoding a cognate RS. In certain embodiments, the second population of cells comprises cells that were not eliminated by the negative selection. The polynucleotides are expressed in the cell and the cell is grown in the presence of a selection agent, e.g., ampicillin. tRNAs are then selected for their ability to be aminoacylated by the coexpressed cognate synthetase and to insert an amino acid in response to this selector codon. Typically, these cells show an enhancement in suppression efficiency compared to cells harboring non-functional tRNA(s), or tRNAs that cannot efficiently be recognized by the synthetase of interest. The cells harboring the non-functional tRNAs or tRNAs that are not efficiently recognized by the synthetase of interest are sensitive to the antibiotic. Therefore, tRNAs that: (i) are not substrates for endogenous host, e.g., Escherichia coli, synthetases; (ii) can be aminoacylated by the synthetase of interest; and (iii) are functional in translation survive both selections.
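The logic of the negative selection followed by the positive selection can be summarized in a short sketch. It is a simplified Boolean model only: the class, field, and function names are hypothetical, and real selections depend on graded suppression levels rather than yes/no properties.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrnaCandidate:
    """Hypothetical yes/no description of one member of a tRNA library."""
    name: str
    charged_by_endogenous_rs: bool   # violates criterion (i)
    charged_by_cognate_rs: bool      # criterion (ii)
    functional_in_translation: bool  # criterion (iii)


def survives_negative_selection(t: TrnaCandidate) -> bool:
    # Negative marker (e.g., barnase) with a selector codon, no cognate RS:
    # the cell dies only if the selector codon is suppressed.
    suppresses = t.charged_by_endogenous_rs and t.functional_in_translation
    return not suppresses


def survives_positive_selection(t: TrnaCandidate) -> bool:
    # Positive marker (e.g., beta-lactamase) with a selector codon, cognate RS
    # co-expressed: the cell lives only if the selector codon is suppressed.
    charged = t.charged_by_cognate_rs or t.charged_by_endogenous_rs
    return charged and t.functional_in_translation


def select_orthogonal(library):
    """Apply the negative selection, then the positive selection."""
    pool = [t for t in library if survives_negative_selection(t)]
    return [t for t in pool if survives_positive_selection(t)]


library = [
    TrnaCandidate("orthogonal", False, True, True),
    TrnaCandidate("cross-reactive", True, True, True),
    TrnaCandidate("non-functional", False, True, False),
    TrnaCandidate("unrecognized", False, False, True),
]
print([t.name for t in select_orthogonal(library)])  # ['orthogonal']
```

In this model, the non-functional candidate survives the negative selection but is removed by the positive selection, matching the statement that only tRNAs meeting criteria (i) through (iii) survive both.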

The stringency of the selection, e.g., the positive selection, the negative selection or both the positive and negative selection, in the above-described methods is optionally varied. For example, because barnase is an extremely toxic protein, the stringency of the negative selection can be controlled by introducing different numbers of selector codons into the barnase gene and/or by using an inducible promoter. In another example, the concentration of the selection or screening agent is varied (e.g., ampicillin concentration). In one aspect of the invention, the stringency is varied because the desired activity can be low during early rounds. Thus, less stringent selection criteria are applied in early rounds and more stringent criteria are applied in later rounds of selection. In certain embodiments, the negative selection, the positive selection or both the negative and positive selection can be repeated multiple times. Multiple different negative selection markers, positive selection markers or both negative and positive selection markers can be used. In certain embodiments, the positive and negative selection marker can be the same.

Other types of selections/screening can be used in the invention for producing orthogonal translational components, e.g., an O-tRNA, an O-RS, and an O-tRNA/O-RS pair that utilizes a redox active amino acid. For example, the negative selection marker, the positive selection marker or both the positive and negative selection markers can include a marker that fluoresces or catalyzes a luminescent reaction in the presence of a suitable reactant. In another embodiment, a product of the marker is detected by fluorescence-activated cell sorting (FACS) or by luminescence. Optionally, the marker includes an affinity based screening marker. See Francisco, J. A., et al., (1993) Production and fluorescence-activated cell sorting of Escherichia coli expressing a functional antibody fragment on the external surface. Proc Natl Acad Sci USA. 90:10444-8.

Additional methods for producing a recombinant orthogonal tRNA can be found, e.g., in International patent applications WO 2002/086075, entitled “Methods and compositions for the production of orthogonal tRNA-aminoacyltRNA synthetase pairs;” and, U.S. Ser. No. 60/479,931, and 60/496,548 entitled “EXPANDING THE EUKARYOTIC GENETIC CODE.” See also Forster et al., (2003) Programming peptidomimetic synthetases by translating genetic codes designed de novo PNAS 100(11):6353-6357; and, Feng et al., (2003), Expanding tRNA recognition of a tRNA synthetase by a single amino acid change, PNAS 100(10): 5676-5681.

Orthogonal aminoacyl-tRNA Synthetase (O-RS)

An O-RS of the invention preferentially aminoacylates an O-tRNA with a redox active amino acid in vitro or in vivo. An O-RS of the invention can be provided to the translation system, e.g., a cell, by a polypeptide that includes an O-RS and/or by a polynucleotide that encodes an O-RS or a portion thereof. For example, an O-RS comprises an amino acid sequence as set forth in SEQ ID NO.: 1, or a conservative variation thereof. In another example, an O-RS, or a portion thereof, is encoded by a polynucleotide sequence that encodes an amino acid sequence comprising SEQ ID NO.: 1, or a complementary polynucleotide sequence thereof. See, e.g., Table 1 and Example 2 herein for sequences of exemplary O-RS molecules. See also the section entitled “Nucleic Acid and Polypeptide Sequence and Variants” herein.

Methods for identifying an orthogonal aminoacyl-tRNA synthetase (O-RS) for use with an O-tRNA are also a feature of the invention. For example, a method includes subjecting to selection, e.g., positive selection, a population of cells of a first species, where the cells individually comprise: 1) a member of a plurality of aminoacyl-tRNA synthetases (RSs), (e.g., the plurality of RSs can include mutant RSs, RSs derived from a species other than the first species or both mutant RSs and RSs derived from a species other than the first species); 2) the orthogonal tRNA (O-tRNA) (e.g., from one or more species); and 3) a polynucleotide that encodes a (e.g., positive) selection marker and comprises at least one selector codon. Cells are selected or screened for those that show an enhancement in suppression efficiency compared to cells lacking or with a reduced amount of the member of the plurality of RSs. Suppression efficiency can be measured by techniques known in the art and as described herein. Cells having an enhancement in suppression efficiency comprise an active RS that aminoacylates the O-tRNA. A level of aminoacylation (in vitro or in vivo) by the active RS of a first set of tRNAs from the first species is compared to the level of aminoacylation (in vitro or in vivo) by the active RS of a second set of tRNAs from a second species. The level of aminoacylation can be determined by a detectable substance (e.g., a labeled amino acid or unnatural amino acid, e.g., a redox active amino acid such as DHP). The active RS that more efficiently aminoacylates the second set of tRNAs compared to the first set of tRNAs is typically selected, thereby providing an efficient (optimized) orthogonal aminoacyl-tRNA synthetase for use with the O-tRNA. An O-RS identified by the method is also a feature of the invention.

Any of a number of assays can be used to determine aminoacylation. These assays can be performed in vitro or in vivo. For example, in vitro aminoacylation assays are described in, e.g., Hoben and Soll (1985) Methods Enzymol. 113:55-59. Aminoacylation can also be determined by using a reporter along with orthogonal translation components and detecting the reporter in a cell expressing a polynucleotide comprising at least one selector codon that encodes a protein. See also WO 2002/085923, entitled “IN VIVO INCORPORATION OF UNNATURAL AMINO ACIDS;” and International Application Number PCT/US2004/011786, filed Apr. 16, 2004.

O-RS can be manipulated to alter the substrate specificity of the synthetase so that only a desired unnatural amino acid, e.g., a redox active amino acid such as DHP, but not any of the common 20 amino acids, is charged to the O-tRNA. Methods to generate an orthogonal aminoacyl tRNA synthetase with a substrate specificity for an unnatural amino acid include mutating the synthetase, e.g., at the active site in the synthetase, at the editing mechanism site in the synthetase, at different sites by combining different domains of synthetases, or the like, and applying a selection process. A strategy is used, which is based on the combination of a positive selection followed by a negative selection. In the positive selection, suppression of the selector codon introduced at a nonessential position(s) of a positive marker allows cells to survive under positive selection pressure. In the presence of both natural and unnatural amino acids, survivors thus encode active synthetases charging the orthogonal suppressor tRNA with either a natural or unnatural amino acid. In the negative selection, suppression of a selector codon introduced at a nonessential position(s) of a negative marker removes synthetases with natural amino acid specificities. Survivors of the negative and positive selection encode synthetases that aminoacylate (charge) the orthogonal suppressor tRNA with unnatural amino acids only. These synthetases can then be subjected to further mutagenesis, e.g., DNA shuffling or other recursive mutagenesis methods.
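The positive-then-negative strategy for synthetases can be illustrated with a small sketch. It is a simplified model under stated assumptions: each mutant synthetase is reduced to two hypothetical yes/no properties (whether it charges the O-tRNA with a natural amino acid, and whether it charges it with the unnatural amino acid), and the variant names are invented for illustration.

```python
def survives_positive(charges_natural: bool, charges_unnatural: bool) -> bool:
    # Positive marker with a selector codon, natural and unnatural amino acids
    # both present: any active synthetase suppresses the codon and survives.
    return charges_natural or charges_unnatural


def survives_negative(charges_natural: bool) -> bool:
    # Negative marker with a selector codon: synthetases with natural amino
    # acid specificity suppress the codon and are removed.
    return not charges_natural


def select_synthetases(library):
    """library maps a variant name to (charges_natural, charges_unnatural)."""
    after_positive = {
        name: props for name, props in library.items()
        if survives_positive(*props)
    }
    return sorted(
        name for name, (natural, _) in after_positive.items()
        if survives_negative(natural)
    )


# Hypothetical library of mutant synthetases.
rs_library = {
    "inactive": (False, False),
    "natural-only": (True, False),
    "promiscuous": (True, True),
    "unnatural-only": (False, True),
}
print(select_synthetases(rs_library))  # ['unnatural-only']
```

Only the variant that charges the unnatural amino acid and no natural amino acid passes both steps, which is the outcome the paragraph above describes for survivors of the two selections.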

A library of mutant O-RSs can be generated using various mutagenesis techniques known in the art. For example, the mutant RSs can be generated by site-specific mutations, random point mutations, homologous recombination, DNA shuffling or other recursive mutagenesis methods, chimeric construction or any combination thereof. For example, a library of mutant RSs can be produced from two or more other, e.g., smaller, less diverse “sub-libraries.” Chimeric libraries of RSs are also included in the invention. It should be noted that libraries of tRNA synthetases from various organisms (e.g., microorganisms such as eubacteria or archaebacteria) such as libraries that comprise natural diversity (see, e.g., U.S. Pat. No. 6,238,884 to Short et al; U.S. Pat. No. 5,756,316 to Schallenberger et al; U.S. Pat. No. 5,783,431 to Petersen et al; U.S. Pat. No. 5,824,485 to Thompson et al; U.S. Pat. No. 5,958,672 to Short et al), are optionally constructed and screened for orthogonal pairs.

Once the synthetases are subjected to the positive and negative selection/screening strategy, these synthetases can then be subjected to further mutagenesis. For example, a nucleic acid that encodes the O-RS can be isolated; a set of polynucleotides that encode mutated O-RSs (e.g., by random mutagenesis, site-specific mutagenesis, recombination or any combination thereof) can be generated from the nucleic acid; and, these individual steps or a combination of these steps can be repeated until a mutated O-RS is obtained that preferentially aminoacylates the O-tRNA with the unnatural amino acid, e.g., the redox active amino acid. In one aspect of the invention, the steps are performed multiple times, e.g., at least two times.

Additional levels of selection/screening stringency can also be used in the methods of the invention for producing O-tRNA, O-RS, or pairs thereof. The selection or screening stringency can be varied on one or both steps of the method to produce an O-RS. This could include, e.g., varying the amount of selection/screening agent that is used, etc. Additional rounds of positive and/or negative selections can also be performed. Selecting or screening can also comprise one or more of a change in amino acid permeability, a change in translation efficiency, a change in translational fidelity, etc. Typically, the one or more changes are based upon a mutation in one or more genes in an organism in which an orthogonal tRNA-tRNA synthetase pair is used to produce protein.

Additional general details for producing O-RS, and altering the substrate specificity of the synthetase can be found in WO 2002/086075 entitled “Methods and compositions for the production of orthogonal tRNA-aminoacyltRNA synthetase pairs;” and International Application Number PCT/US2004/011786, filed Apr. 16, 2004.

Source and Host Organisms

The translational components of the invention can be derived from non-eukaryotic organisms. For example, the orthogonal O-tRNA can be derived from a non-eukaryotic organism (or a combination of organisms), e.g., an archaebacterium, such as, Methanococcus jannaschii, Methanobacterium thermoautotrophicum, Halobacterium such as Haloferax volcanii and Halobacterium species NRC-1, Archaeoglobus fulgidus, Pyrococcus furiosus, Pyrococcus horikoshii, Aeropyrum pernix, Methanococcus maripaludis, Methanopyrus kandleri, Methanosarcina mazei (Mm), Pyrobaculum aerophilum, Pyrococcus abyssi, Sulfolobus solfataricus (Ss), Sulfolobus tokodaii, Thermoplasma acidophilum, Thermoplasma volcanium, or the like, or a eubacterium, such as Escherichia coli, Thermus thermophilus, Bacillus stearothermophilus, or the like, while the orthogonal O-RS can be derived from a non-eukaryotic organism (or a combination of organisms), e.g., an archaebacterium, such as Methanococcus jannaschii, Methanobacterium thermoautotrophicum, Halobacterium such as Haloferax volcanii and Halobacterium species NRC-1, Archaeoglobus fulgidus, Pyrococcus furiosus, Pyrococcus horikoshii, Aeropyrum pernix, Methanococcus maripaludis, Methanopyrus kandleri, Methanosarcina mazei, Pyrobaculum aerophilum, Pyrococcus abyssi, Sulfolobus solfataricus, Sulfolobus tokodaii, Thermoplasma acidophilum, Thermoplasma volcanium, or the like, or a eubacterium, such as Escherichia coli, Thermus thermophilus, Bacillus stearothermophilus, or the like. In one embodiment, eukaryotic sources, e.g., plants, algae, protists, fungi, yeasts, animals (e.g., mammals, insects, arthropods, etc.), or the like, can also be used as sources of O-tRNAs and O-RSs.

The individual components of an O-tRNA/O-RS pair can be derived from the same organism or different organisms. In one embodiment, the O-tRNA/O-RS pair is from the same organism. Alternatively, the O-tRNA and the O-RS of the O-tRNA/O-RS pair are from different organisms.

The O-tRNA, O-RS or O-tRNA/O-RS pair can be selected or screened in vivo or in vitro and/or used in a cell, e.g., a non-eukaryotic cell or a eukaryotic cell, to produce a polypeptide with a redox active amino acid. A non-eukaryotic cell can be from a variety of sources, e.g., a eubacterium, such as Escherichia coli, Thermus thermophilus, Bacillus stearothermophilus, or the like, or an archaebacterium, such as Methanococcus jannaschii, Methanobacterium thermoautotrophicum, Halobacterium such as Haloferax volcanii and Halobacterium species NRC-1, Archaeoglobus fulgidus, Pyrococcus furiosus, Pyrococcus horikoshii, Aeropyrum pernix, Methanococcus maripaludis, Methanopyrus kandleri, Methanosarcina mazei (Mm), Pyrobaculum aerophilum, Pyrococcus abyssi, Sulfolobus solfataricus (Ss), Sulfolobus tokodaii, Thermoplasma acidophilum, Thermoplasma volcanium, or the like. A eukaryotic cell can be from a variety of sources, e.g., a plant (e.g., complex plant such as monocots, or dicots), an alga, a protist, a fungus, a yeast (e.g., Saccharomyces cerevisiae), an animal (e.g., a mammal, an insect, an arthropod, etc.), or the like. Compositions of cells with translational components of the invention are also a feature of the invention.

See also, International Application Number PCT/US2004/011786, filed Apr. 16, 2004, for screening O-tRNA and/or O-RS in one species for use in another species.

Selector Codons

Selector codons of the invention expand the genetic codon framework of protein biosynthetic machinery. For example, a selector codon includes, e.g., a unique three base codon, a nonsense codon, such as a stop codon, e.g., an amber codon (UAG), or an opal codon (UGA), an unnatural codon, at least a four base codon, a rare codon, or the like. A number of selector codons can be introduced into a desired gene, e.g., one or more, two or more, more than three, etc. By using different selector codons, multiple orthogonal tRNA/synthetase pairs can be used that allow the simultaneous site-specific incorporation of multiple redox active amino acids, e.g., unnatural amino acids, using these different selector codons.

In one embodiment, the methods involve the use of a selector codon that is a stop codon for the incorporation of a redox active amino acid in vivo in a cell. For example, an O-tRNA is produced that recognizes the stop codon and is aminoacylated by an O-RS with a redox active amino acid. This O-tRNA is not recognized by the naturally occurring host's aminoacyl-tRNA synthetases. Conventional site-directed mutagenesis can be used to introduce the stop codon at the site of interest in a polynucleotide encoding a polypeptide of interest. See, e.g., Sayers, J. R., et al. (1988), 5′,3′ Exonuclease in phosphorothioate-based oligonucleotide-directed mutagenesis. Nucleic Acids Res, 791-802. When the O-RS, O-tRNA and the nucleic acid that encodes a polypeptide of interest are combined, e.g., in vivo, the redox active amino acid is incorporated in response to the stop codon to give a polypeptide containing the redox active amino acid at the specified position. In one embodiment of the invention, a stop codon used as a selector codon is an amber codon, UAG, and/or an opal codon, UGA. In one example, a genetic code in which UAG and UGA are both used as a selector codon can encode 22 amino acids while preserving the ochre nonsense codon, UAA, which is the most abundant termination signal.
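Placing a stop codon at a site of interest can be modeled in silico before any mutagenesis is designed. The sketch below is illustrative only: the function name, the 1-based codon numbering, and the example sequence are assumptions, and it operates on a DNA coding sequence (where the amber codon is written TAG) rather than describing a laboratory protocol.

```python
def introduce_selector_codon(coding_seq, codon_position, selector="TAG"):
    """Return coding_seq with the codon at codon_position replaced.

    coding_seq:     DNA coding sequence, length a multiple of three
    codon_position: 1-based index of the codon to replace
    selector:       three-base selector codon, amber (TAG) by default
    """
    seq = coding_seq.upper()
    if len(seq) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    if len(selector) != 3:
        raise ValueError("this sketch handles three-base selector codons only")
    n_codons = len(seq) // 3
    if not 1 <= codon_position <= n_codons:
        raise ValueError("codon position is outside the coding sequence")
    start = 3 * (codon_position - 1)
    return seq[:start] + selector.upper() + seq[start + 3:]


# Hypothetical four-codon sequence: ATG GCT AAA TAA
mutant = introduce_selector_codon("ATGGCTAAATAA", codon_position=2)
print(mutant)  # ATGTAGAAATAA
```

The natural TAA terminator at the end of the example is left unchanged, which mirrors the point above that the ochre codon can be preserved as the termination signal while UAG serves as a selector codon.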

The incorporation of redox active amino acids in vivo can be done without significant perturbation of the host cell. For example, in non-eukaryotic cells, such as Escherichia coli, because the suppression efficiency for the UAG codon depends upon the competition between the O-tRNA, e.g., the amber suppressor tRNA, and the release factor 1 (RF1) (which binds to the UAG codon and initiates release of the growing peptide from the ribosome), the suppression efficiency can be modulated by, e.g., either increasing the expression level of O-tRNA, e.g., the suppressor tRNA, or using an RF1 deficient strain. In eukaryotic cells, because the suppression efficiency for the UAG codon depends upon the competition between the O-tRNA, e.g., the amber suppressor tRNA, and a eukaryotic release factor (e.g., eRF) (which binds to a stop codon and initiates release of the growing peptide from the ribosome), the suppression efficiency can be modulated by, e.g., increasing the expression level of O-tRNA, e.g., the suppressor tRNA. In addition, additional compounds can also be present, e.g., reducing agents such as dithiothreitol (DTT).

Redox active amino acids can also be encoded with rare codons. For example, when the arginine concentration in an in vitro protein synthesis reaction is reduced, the rare arginine codon, AGG, has proven to be efficient for insertion of Ala by a synthetic tRNA acylated with alanine. See, e.g., Ma et al., Biochemistry, 32:7939 (1993). In this case, the synthetic tRNA competes with the naturally occurring tRNAArg, which exists as a minor species in Escherichia coli. In addition, some organisms do not use all triplet codons. An unassigned codon AGA in Micrococcus luteus has been utilized for insertion of amino acids in an in vitro transcription/translation extract. See, e.g., Kowal and Oliver, Nucl. Acid Res. 25:4685 (1997). Components of the invention can be generated to use these rare codons in vivo.

Selector codons can also comprise extended codons, e.g., four or more base codons, such as four, five, six or more base codons. Examples of four base codons include, e.g., AGGA, CUAG, UAGA, CCCU, and the like. Examples of five base codons include, e.g., AGGAC, CCCCU, CCCUC, CUAGA, CUACU, UAGGC and the like. Methods of the invention include using extended codons based on frameshift suppression. Four or more base codons can insert, e.g., one or multiple unnatural amino acids such as a redox active amino acid, into the same protein. In other embodiments, the anticodon loops can decode, e.g., at least a four-base codon, at least a five-base codon, or at least a six-base codon or more. Since there are 256 possible four-base codons, multiple unnatural amino acids can be encoded in the same cell using a four or more base codon. See also, Anderson et al., (2002) Exploring the Limits of Codon and Anticodon Size, Chemistry and Biology, 9:237-244; and, Magliery, (2001) Expanding the Genetic Code: Selection of Efficient Suppressors of Four-base Codons and Identification of “Shifty” Four-base Codons with a Library Approach in Escherichia coli, J. Mol. Biol. 307: 755-769.
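The count of 256 four-base codons follows from four bases at each of four positions (4^4). A short Python sketch, using a hypothetical helper `extended_codons`, enumerates them and confirms that the example codons listed above are members of the corresponding sets.

```python
from itertools import product

BASES = "ACGU"


def extended_codons(length):
    """Enumerate every codon of the given length over the four RNA bases."""
    return ["".join(p) for p in product(BASES, repeat=length)]


four_base = extended_codons(4)
five_base = extended_codons(5)

print(len(four_base))  # 4**4
print(len(five_base))  # 4**5
print(all(c in four_base for c in ("AGGA", "CUAG", "UAGA", "CCCU")))
```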

For example, four-base codons have been used to incorporate unnatural amino acids into proteins using in vitro biosynthetic methods. See, e.g., Ma et al., (1993) Biochemistry, 32:7939; and Hohsaka et al., (1999) J. Am. Chem. Soc. 121:34. CGGG and AGGU were used to simultaneously incorporate 2-naphthylalanine and an NBD derivative of lysine into streptavidin in vitro with two chemically acylated frameshift suppressor tRNAs. See, e.g., Hohsaka et al., (1999) J. Am. Chem. Soc., 121:12194. In an in vivo study, Moore et al. examined the ability of tRNA^(Leu) derivatives with NCUA anticodons to suppress UAGN codons (N can be U, A, G, or C), and found that the quadruplet UAGA can be decoded by a tRNA^(Leu) with a UCUA anticodon with an efficiency of 13 to 26% with little decoding in the 0 or −1 frame. See Moore et al., (2000) J. Mol. Biol. 298:195. In one embodiment, extended codons based on rare codons or nonsense codons can be used in the invention, which can reduce missense readthrough and frameshift suppression at other unwanted sites.
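The relationship between the UAGN codons and the NCUA anticodons is antiparallel base pairing: the anticodon is the reverse complement of the codon. The sketch below assumes strict Watson-Crick pairing only (no wobble or modified bases), and the helper name `anticodon_for` is hypothetical.

```python
# Strict Watson-Crick RNA pairing; wobble pairing is deliberately ignored.
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def anticodon_for(codon):
    """Return the anticodon (written 5' to 3') that pairs with a codon
    (written 5' to 3'), i.e. the reverse complement."""
    return "".join(PAIR[base] for base in reversed(codon))


print(anticodon_for("UAGA"))                         # quadruplet from the text
print([anticodon_for("UAG" + n) for n in "UAGC"])    # the UAGN family
```

Every anticodon in the UAGN family ends in CUA, which is why the family is written NCUA.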

For a given system, a selector codon can also include one of the natural three base codons, where the endogenous system does not use (or rarely uses) the natural base codon. For example, this includes a system that is lacking a tRNA that recognizes the natural three base codon, and/or a system where the three base codon is a rare codon.

Selector codons optionally include unnatural base pairs. These unnatural base pairs further expand the existing genetic alphabet. One extra base pair increases the number of triplet codons from 64 to 125. Properties of third base pairs include stable and selective base pairing, efficient enzymatic incorporation into DNA with high fidelity by a polymerase, and the efficient continued primer extension after synthesis of the nascent unnatural base pair. Descriptions of unnatural base pairs which can be adapted for methods and compositions include, e.g., Hirao, et al., (2002) An unnatural base pair for incorporating amino acid analogues into protein, Nature Biotechnology, 20:177-182. See also Wu, Y., et al., (2002) J. Am. Chem. Soc. 124:14626-14630. Other relevant publications are listed below.
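The triplet counts quoted above are powers of the alphabet size: n letters give n^3 triplets. The sketch below, with a hypothetical helper `triplet_count`, shows that 64 corresponds to four letters and 125 to five. Reading the 125 figure as one added self-pairing base is an interpretation, not a statement from the text; a pair made of two distinct new bases would give six letters and 216 triplets.

```python
def triplet_count(alphabet_size):
    """Number of distinct three-letter codons over an alphabet of the given size."""
    return alphabet_size ** 3


print(triplet_count(4))  # natural alphabet
print(triplet_count(5))  # one added letter (e.g., a self-pairing base)
print(triplet_count(6))  # two added letters (a pair of distinct new bases)
```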

For in vivo usage, the unnatural nucleoside is membrane permeable and is phosphorylated to form the corresponding triphosphate. In addition, the increased genetic information is stable and not destroyed by cellular enzymes. Previous efforts by Benner and others took advantage of hydrogen bonding patterns that are different from those in canonical Watson-Crick pairs, the most noteworthy example of which is the iso-C:iso-G pair. See, e.g., Switzer et al., (1989) J. Am. Chem. Soc., 111:8322; and Piccirilli et al., (1990) Nature, 343:33; Kool, (2000) Curr. Opin. Chem. Biol. 4:602. These bases in general mispair to some degree with natural bases and cannot be enzymatically replicated. Kool and co-workers demonstrated that hydrophobic packing interactions between bases can replace hydrogen bonding to drive the formation of a base pair. See Kool, (2000) Curr. Opin. Chem. Biol. 4:602; and Guckian and Kool, (1998) Angew. Chem. Int. Ed. Engl., 36, 2825. In an effort to develop an unnatural base pair satisfying all the above requirements, Schultz, Romesberg and co-workers have systematically synthesized and studied a series of unnatural hydrophobic bases. A PICS:PICS self-pair is found to be more stable than natural base pairs, and can be efficiently incorporated into DNA by Klenow fragment of Escherichia coli DNA polymerase I (KF). See, e.g., McMinn et al., (1999) J. Am. Chem. Soc., 121:11586; and Ogawa et al., (2000) J. Am. Chem. Soc., 122:3274. A 3MN:3MN self-pair can be synthesized by KF with efficiency and selectivity sufficient for biological function. See, e.g., Ogawa et al., (2000) J. Am. Chem. Soc. 122:8803. However, both bases act as a chain terminator for further replication. A mutant DNA polymerase has been recently evolved that can be used to replicate the PICS self pair. In addition, a 7AI self pair can be replicated. See, e.g., Tae et al., (2001) J. Am. Chem. Soc., 123:7439. A novel metallobase pair, Dipic:Py, has also been developed, which forms a stable pair upon binding Cu(II). See Meggers et al., (2000) J. Am. Chem. Soc. 122:10714. Because extended codons and unnatural codons are intrinsically orthogonal to natural codons, the methods of the invention can take advantage of this property to generate orthogonal tRNAs for them.

A translational bypassing system can also be used to incorporate a redox active amino acid in a desired polypeptide. In a translational bypassing system, a large sequence is inserted into a gene but is not translated into protein. The sequence contains a structure that serves as a cue to induce the ribosome to hop over the sequence and resume translation downstream of the insertion.

Unnatural Amino Acids

As used herein, an unnatural amino acid refers to any amino acid, modified amino acid, or amino acid analogue other than selenocysteine and/or pyrrolysine and the following twenty genetically encoded alpha-amino acids: alanine, arginine, asparagine, aspartic acid, cysteine, glutamine, glutamic acid, glycine, histidine, isoleucine, leucine, lysine, methionine, phenylalanine, proline, serine, threonine, tryptophan, tyrosine, valine. The generic structure of an alpha-amino acid is illustrated by Formula I:

An unnatural amino acid is typically any structure having Formula I wherein the R group is any substituent other than one used in the twenty natural amino acids. See, e.g., Biochemistry by L. Stryer, 3^(rd) ed. 1988, Freeman and Company, New York, for structures of the twenty natural amino acids. Note that the unnatural amino acids of the invention can be naturally occurring compounds other than the twenty alpha-amino acids above.

Because the unnatural amino acids of the invention typically differ from the natural amino acids in side chain, the unnatural amino acids form amide bonds with other amino acids, e.g., natural or unnatural, in the same manner in which they are formed in naturally occurring proteins. However, the unnatural amino acids have side chain groups that distinguish them from the natural amino acids.

Of particular interest in incorporating unnatural amino acids into proteins is the ability to incorporate a redox active amino acid, e.g., an unnatural amino acid which comprises a moiety which allows electron and/or proton transfer in and out of the molecule, into proteins. For example, in a redox active amino acid, R in Formula I includes, but is not limited to, e.g., keto-, azido-, hydroxyl-, halo- (e.g., iodo-), nitro-, thiol-, seleno-, sulfonyl-, heterocyclic, aldehyde, thioacid, and the like, or any combination thereof. Examples of redox active amino acids of the invention include, but are not limited to, e.g., 3,4-dihydroxy-L-phenylalanine (DHP), 3,4,6-trihydroxy-L-phenylalanine, 3,4,5-trihydroxy-L-phenylalanine, 3-nitro-tyrosine, 4-nitro-phenylalanine, 3-thiol-tyrosine, and the like. See also FIG. 1.

In other unnatural amino acids, for example, R in Formula I optionally comprises an alkyl-, aryl-, acyl-, hydrazine, cyano-, halo-, hydrazide, alkenyl, alkynyl, ether, borate, boronate, phospho, phosphono, phosphine, enone, imine, ester, hydroxylamine, amine, and the like, or any combination thereof. Other unnatural amino acids of interest include, but are not limited to, amino acids comprising a photoactivatable cross-linker, spin-labeled amino acids, fluorescent amino acids, metal binding amino acids, metal-containing amino acids, radioactive amino acids, amino acids with novel functional groups, amino acids that covalently or noncovalently interact with other molecules, photocaged and/or photoisomerizable amino acids, biotin or biotin-analogue containing amino acids, keto containing amino acids, glycosylated amino acids, a saccharide moiety attached to the amino acid side chain, amino acids comprising polyethylene glycol or polyether, heavy atom substituted amino acids, chemically cleavable or photocleavable amino acids, amino acids with an elongated side chain as compared to natural amino acids (e.g., polyethers or long chain hydrocarbons, e.g., greater than about 5, greater than about 10 carbons, etc.), carbon-linked sugar-containing amino acids, amino thioacid containing amino acids, and amino acids containing one or more toxic moieties.

In addition to unnatural amino acids that contain novel side chains, unnatural amino acids also optionally comprise modified backbone structures, e.g., as illustrated by the structures of Formula II and III:

wherein Z typically comprises OH, NH₂, SH, NH—R′, or S—R′; X and Y, which can be the same or different, typically comprise S or O, and R and R′, which are optionally the same or different, are typically selected from the same list of constituents for the R group described above for the unnatural amino acids having Formula I as well as hydrogen. For example, unnatural amino acids of the invention optionally comprise substitutions in the amino or carboxyl group as illustrated by Formulas II and III. Unnatural amino acids of this type include, but are not limited to, α-hydroxy acids, α-thioacids, and α-aminothiocarboxylates, e.g., with side chains corresponding to the common twenty natural amino acids or unnatural side chains. In addition, substitutions at the α-carbon optionally include L, D, or α-α-disubstituted amino acids such as D-glutamate, D-alanine, D-methyl-O-tyrosine, aminobutyric acid, and the like. Other structural alternatives include cyclic amino acids, such as proline analogues as well as 3, 4, 6, 7, 8, and 9 membered ring proline analogues, β and γ amino acids such as substituted β-alanine and γ-amino butyric acid.

For example, many unnatural amino acids are based on natural amino acids, such as tyrosine, glutamine, phenylalanine, and the like. Tyrosine analogs include para-substituted tyrosines, ortho-substituted tyrosines, and meta-substituted tyrosines, wherein the substituted tyrosine comprises an acetyl group, a benzoyl group, an amino group, a hydrazine, a hydroxyamine, a thiol group, a carboxy group, an isopropyl group, a methyl group, a C₆-C₂₀ straight chain or branched hydrocarbon, a saturated or unsaturated hydrocarbon, an O-methyl group, a polyether group, a nitro group, or the like. In addition, multiply substituted aryl rings are also contemplated. Glutamine analogs of the invention include, but are not limited to, α-hydroxy derivatives, γ-substituted derivatives, cyclic derivatives, and amide substituted glutamine derivatives. Example phenylalanine analogs include, but are not limited to, para-substituted phenylalanines, ortho-substituted phenylalanines, and meta-substituted phenylalanines, wherein the substituent comprises a hydroxy group, a methoxy group, a methyl group, an alkyl group, an aldehyde, a nitro, a thiol group, or keto group, or the like. Specific examples of unnatural amino acids include, but are not limited to, a 3,4-dihydroxy-L-phenylalanine (DHP), a 3,4,6-trihydroxy-L-phenylalanine, a 3,4,5-trihydroxy-L-phenylalanine, 4-nitro-phenylalanine, a p-acetyl-L-phenylalanine, a p-propargyloxyphenylalanine, O-methyl-L-tyrosine, an L-3-(2-naphthyl)alanine, a 3-methyl-phenylalanine, an O-4-alkyl-L-tyrosine, a 4-propyl-L-tyrosine, a 3-nitro-tyrosine, a 3-thiol-tyrosine, a tri-O-acetyl-GlcNAcβ-serine, an L-Dopa, a fluorinated phenylalanine, an isopropyl-L-phenylalanine, a p-azido-L-phenylalanine, a p-acyl-L-phenylalanine, a p-benzoyl-L-phenylalanine, an L-phosphoserine, a phosphonoserine, a phosphonotyrosine, a p-iodo-phenylalanine, a p-bromophenylalanine, a p-amino-L-phenylalanine, and the like. The structures of a variety of unnatural amino acids are provided in, for example, FIG. 1 herein and FIGS. 16, 17, 18, 19, 26, and 29 of WO 2002/085923 entitled “In vivo incorporation of unnatural amino acids.”

Chemical Synthesis of Unnatural Amino Acids

Many of the unnatural amino acids provided above are commercially available, e.g., from Sigma (USA) or Aldrich (Milwaukee, Wis., USA). Those that are not commercially available are optionally synthesized as provided in various publications or using standard methods known to those of skill in the art. For organic synthesis techniques, see, e.g., Organic Chemistry by Fessenden and Fessenden, (1982, Second Edition, Willard Grant Press, Boston Mass.); Advanced Organic Chemistry by March (Third Edition, 1985, Wiley and Sons, New York); and Advanced Organic Chemistry by Carey and Sundberg (Third Edition, Parts A and B, 1990, Plenum Press, New York). Additional publications describing the synthesis of unnatural amino acids include, e.g., WO 2002/085923 entitled “In vivo incorporation of Unnatural Amino Acids;” Matsoukas et al., (1995) J. Med. Chem. 38, 4660-4669; King, F. E. & Kidd, D. A. A. (1949) A New Synthesis of Glutamine and of γ-Dipeptides of Glutamic Acid from Phthylated Intermediates. J. Chem. Soc., 3315-3319; Friedman, O. M. & Chatterji, R. (1959) Synthesis of Derivatives of Glutamine as Model Substrates for Anti-Tumor Agents. J. Am. Chem. Soc. 81, 3750-3752; Craig, J. C. et al. (1988) Absolute Configuration of the Enantiomers of 7-Chloro-4[[4-(diethylamino)-1-methylbutyl]amino]quinoline (Chloroquine). J. Org. Chem. 53, 1167-1170; Azoulay, M., Vilmont, M. & Frappier, F. (1991) Glutamine analogues as Potential Antimalarials. Eur. J. Med. Chem. 26, 201-5; Koskinen, A. M. P. & Rapoport, H. (1989) Synthesis of 4-Substituted Prolines as Conformationally Constrained Amino Acid Analogues. J. Org. Chem. 54, 1859-1866; Christie, B. D. & Rapoport, H. (1985) Synthesis of Optically Pure Pipecolates from L-Asparagine. Application to the Total Synthesis of (+)-Apovincamine through Amino Acid Decarbonylation and Iminium Ion Cyclization. J. Org. Chem. 1989:1859-1866; Barton et al., (1987) Synthesis of Novel α-Amino-Acids and Derivatives Using Radical Chemistry: Synthesis of L- and D-α-Amino-Adipic Acids, L-α-aminopimelic Acid and Appropriate Unsaturated Derivatives. Tetrahedron Lett. 43:4297-4308; and, Subasinghe et al., (1992) Quisqualic acid analogues: synthesis of beta-heterocyclic 2-aminopropanoic acid derivatives and their activity at a novel quisqualate-sensitized site. J. Med. Chem. 35:4602-7. See also International Application Number PCT/US03/41346, entitled “Protein Arrays,” filed on Dec. 22, 2003.

Cellular Uptake of Unnatural Amino Acids

Unnatural amino acid uptake by a cell is one issue that is typically considered when designing and selecting unnatural amino acids, e.g., for incorporation into a protein. For example, the high charge density of α-amino acids suggests that these compounds are unlikely to be cell permeable. Natural amino acids are taken up into the cell via a collection of protein-based transport systems often displaying varying degrees of amino acid specificity. A rapid screen can be done which assesses which unnatural amino acids, if any, are taken up by cells. See, e.g., the toxicity assays in, e.g., International Application Number PCT/US03/41346, entitled “Protein Arrays,” filed on Dec. 22, 2003; and Liu and Schultz (1999) Progress toward the evolution of an organism with an expanded genetic code. PNAS 96:4780-4785. Although uptake is easily analyzed with various assays, an alternative to designing unnatural amino acids that are amenable to cellular uptake pathways is to provide biosynthetic pathways to create amino acids in vivo.

Biosynthesis of Unnatural Amino Acids

Many biosynthetic pathways already exist in cells for the production of amino acids and other compounds. While a biosynthetic method for a particular unnatural amino acid may not exist in nature, e.g., in a cell, the invention provides such methods. For example, biosynthetic pathways for unnatural amino acids are optionally generated in a host cell by adding new enzymes or modifying existing host cell pathways. Additional new enzymes are optionally naturally occurring enzymes or artificially evolved enzymes. For example, the biosynthesis of p-aminophenylalanine (as presented in an example in WO 2002/085923, supra) relies on the addition of a combination of known enzymes from other organisms. The genes for these enzymes can be introduced into a cell by transforming the cell with a plasmid comprising the genes. The genes, when expressed in the cell, provide an enzymatic pathway to synthesize the desired compound. Examples of the types of enzymes that are optionally added are provided in the examples below. Additional enzyme sequences are found, e.g., in Genbank. Artificially evolved enzymes are also optionally added into a cell in the same manner. In this manner, the cellular machinery and resources of a cell are manipulated to produce unnatural amino acids.

Indeed, any of a variety of methods can be used for producing novel enzymes for use in biosynthetic pathways, or for evolution of existing pathways, for the production of unnatural amino acids, in vitro or in vivo. Many available methods of evolving enzymes and other biosynthetic pathway components can be applied to the present invention to produce unnatural amino acids (or, indeed, to evolve synthetases to have new substrate specificities or other activities of interest). For example, DNA shuffling is optionally used to develop novel enzymes and/or pathways of such enzymes for the production of unnatural amino acids (or production of new synthetases), in vitro or in vivo. See, e.g., Stemmer (1994), Rapid evolution of a protein in vitro by DNA shuffling, Nature 370(4):389-391; and, Stemmer, (1994), DNA shuffling by random fragmentation and reassembly: In vitro recombination for molecular evolution, Proc. Natl. Acad. Sci. USA., 91:10747-10751. A related approach shuffles families of related (e.g., homologous) genes to quickly evolve enzymes with desired characteristics. An example of such “family gene shuffling” methods is found in Crameri et al. (1998) “DNA shuffling of a family of genes from diverse species accelerates directed evolution” Nature, 391(6664): 288-291. New enzymes (whether biosynthetic pathway components or synthetases) can also be generated using a DNA recombination procedure known as “incremental truncation for the creation of hybrid enzymes” (“ITCHY”), e.g., as described in Ostermeier et al. (1999) “A combinatorial approach to hybrid enzymes independent of DNA homology” Nature Biotech 17:1205. This approach can also be used to generate a library of enzyme or other pathway variants which can serve as substrates for one or more in vitro or in vivo recombination methods. See, also, Ostermeier et al. (1999) “Combinatorial Protein Engineering by Incremental Truncation,” Proc. Natl. Acad. Sci. USA, 96: 3562-67, and Ostermeier et al. (1999), “Incremental Truncation as a Strategy in the Engineering of Novel Biocatalysts,” Biological and Medicinal Chemistry, 7: 2139-44. Another approach uses exponential ensemble mutagenesis to produce libraries of enzyme or other pathway variants that are, e.g., selected for an ability to catalyze a biosynthetic reaction relevant to producing an unnatural amino acid (or a new synthetase). In this approach, small groups of residues in a sequence of interest are randomized in parallel to identify, at each altered position, amino acids which lead to functional proteins. Examples of such procedures, which can be adapted to the present invention to produce new enzymes for the production of unnatural amino acids (or new synthetases) are found in Delegrave & Youvan (1993) Biotechnology Research 11:1548-1552. In yet another approach, random or semi-random mutagenesis using doped or degenerate oligonucleotides for enzyme and/or pathway component engineering can be used, e.g., by using the general mutagenesis methods of e.g., Arkin and Youvan (1992) “Optimizing nucleotide mixtures to encode specific subsets of amino acids for semi-random mutagenesis” Biotechnology 10:297-300; or Reidhaar-Olson et al. (1991) “Random mutagenesis of protein sequences using oligonucleotide cassettes” Methods Enzymol. 208:564-86. Yet another approach, often termed a “non-stochastic” mutagenesis, which uses polynucleotide reassembly and site-saturation mutagenesis can be used to produce enzymes and/or pathway components, which can then be screened for an ability to perform one or more synthetase or biosynthetic pathway function (e.g., for the production of unnatural amino acids in vivo). See, e.g., Short “Non-Stochastic Generation of Genetic Vaccines and Enzymes” WO 00/46344.

An alternative to such mutational methods involves recombining entire genomes of organisms and selecting resulting progeny for particular pathway functions (often referred to as “whole genome shuffling”). This approach can be applied to the present invention, e.g., by genomic recombination and selection of an organism (e.g., an E. coli or other cell) for an ability to produce an unnatural amino acid (or intermediate thereof). For example, methods taught in the following publications can be applied to pathway design for the evolution of existing and/or new pathways in cells to produce unnatural amino acids in vivo: Patnaik et al. (2002) “Genome shuffling of lactobacillus for improved acid tolerance” Nature Biotechnology, 20(7): 707-712; and Zhang et al. (2002) “Genome shuffling leads to rapid phenotypic improvement in bacteria” Nature, February 7, 415(6872): 644-646.

Other techniques for organism and metabolic pathway engineering, e.g., for the production of desired compounds are also available and can also be applied to the production of unnatural amino acids. Examples of publications teaching useful pathway engineering approaches include: Nakamura and White (2003) “Metabolic engineering for the microbial production of 1,3 propanediol” Curr. Opin. Biotechnol. 14(5):454-9; Berry et al. (2002) “Application of Metabolic Engineering to improve both the production and use of Biotech Indigo” J. Industrial Microbiology and Biotechnology 28:127-133; Banta et al. (2002) “Optimizing an artificial metabolic pathway: Engineering the cofactor specificity of Corynebacterium 2,5-diketo-D-gluconic acid reductase for use in vitamin C biosynthesis” Biochemistry, 41(20), 6226-36; Selivonova et al. (2001) “Rapid Evolution of Novel Traits in Microorganisms” Applied and Environmental Microbiology, 67:3645, and many others.

Regardless of the method used, typically, the unnatural amino acid produced with an engineered biosynthetic pathway of the invention is produced in a concentration sufficient for efficient protein biosynthesis, e.g., a natural cellular amount, but not to such a degree as to significantly affect the concentration of other cellular amino acids or to exhaust cellular resources. Typical concentrations produced in vivo in this manner are about 10 mM to about 0.05 mM. Once a cell is engineered to produce enzymes desired for a specific pathway and an unnatural amino acid is generated, in vivo selections are optionally used to further optimize the production of the unnatural amino acid for both ribosomal protein synthesis and cell growth.

Orthogonal Components for Incorporating 3,4-dihydroxy-L-phenylalanine (DHP)

The invention provides compositions and methods of producing orthogonal components for incorporating a redox-active amino acid, e.g., 3,4-dihydroxy-L-phenylalanine (DHP), into a growing polypeptide chain in response to a selector codon, e.g., a stop codon, a nonsense codon, a four or more base codon, etc., e.g., in vivo. For example, the invention provides orthogonal-tRNAs (O-tRNAs), orthogonal aminoacyl-tRNA synthetases (O-RSs) and pairs thereof. These pairs can be used to incorporate DHP into growing polypeptide chains.

A composition of the invention includes an orthogonal aminoacyl-tRNA synthetase (O-RS), where the O-RS preferentially aminoacylates an O-tRNA with a DHP. In certain embodiments, the O-RS comprises an amino acid sequence comprising SEQ ID NO.: 1, or a conservative variation thereof. In certain embodiments of the invention, the O-RS preferentially aminoacylates the O-tRNA with a redox-active amino acid, where the O-RS has an efficiency of at least 50% of the efficiency of a polypeptide comprising an amino acid sequence of SEQ ID NO.: 1.

A composition that includes an O-RS can optionally further include an orthogonal tRNA (O-tRNA), where the O-tRNA recognizes a selector codon. Typically, an O-tRNA of the invention includes at least about, e.g., a 45%, a 50%, a 60%, a 75%, an 80%, or a 90% or more suppression efficiency in the presence of a cognate synthetase in response to a selector codon as compared to the O-tRNA comprising or encoded by a polynucleotide sequence as set forth in the sequence listings and examples herein. In one embodiment, the suppression efficiency of the O-RS and the O-tRNA together is, e.g., 5 fold, 10 fold, 15 fold, 20 fold, 25 fold or more greater than the suppression efficiency of the O-tRNA lacking the O-RS. In one aspect, the suppression efficiency of the O-RS and the O-tRNA together is at least 45% of the suppression efficiency of an orthogonal tyrosyl-tRNA synthetase pair derived from Methanococcus jannaschii.
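The two comparisons in the preceding paragraph, a percentage relative to a reference O-tRNA or pair and a fold increase over the O-tRNA without its O-RS, are simple ratios. The sketch below uses hypothetical function names and made-up reporter readouts in arbitrary units; none of the numbers come from the disclosure.

```python
def relative_efficiency_percent(test_signal, reference_signal):
    """Suppression efficiency of a test O-tRNA (or pair) expressed as a
    percentage of a reference measured in the same assay."""
    return 100.0 * test_signal / reference_signal


def fold_enhancement(signal_with_rs, signal_without_rs):
    """Fold increase in suppression when the O-RS is present."""
    return signal_with_rs / signal_without_rs


# Hypothetical reporter readouts (arbitrary units), for illustration only.
reference_pair = 200.0
test_pair = 120.0
trna_alone = 12.0

print(relative_efficiency_percent(test_pair, reference_pair))
print(fold_enhancement(test_pair, trna_alone))
```

With these illustrative values the test pair would clear a 45% threshold relative to the reference and show a 10 fold enhancement over the O-tRNA alone.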

A composition that includes an O-tRNA can optionally include a cell (e.g., a non-eukaryotic cell, such as an E. coli cell and the like, or a eukaryotic cell), and/or a translation system.

A cell (e.g., a non-eukaryotic cell, or a eukaryotic cell) comprising a translation system is also provided by the invention, where the translation system includes an orthogonal-tRNA (O-tRNA); an orthogonal aminoacyl-tRNA synthetase (O-RS); and, a redox active amino acid, e.g., 3,4-dihydroxy-L-phenylalanine (DHP). Typically, the O-RS preferentially aminoacylates the O-tRNA with an efficiency of at least 50% of the efficiency of a polypeptide comprising an amino acid sequence of SEQ ID NO.: 1. The O-tRNA recognizes the first selector codon, and the O-RS preferentially aminoacylates the O-tRNA with the 3,4-dihydroxy-L-phenylalanine (DHP). In one embodiment, the O-tRNA comprises or is encoded by a polynucleotide sequence as set forth in SEQ ID NO.: 2, or a complementary polynucleotide sequence thereof. In one embodiment, the O-RS comprises an amino acid sequence as set forth in SEQ ID NO.: 1, or a conservative variation thereof.

A cell of the invention can optionally further comprise an additional different O-tRNA/O-RS pair and a second unnatural amino acid, e.g., where this O-tRNA recognizes a second selector codon and this O-RS preferentially aminoacylates the O-tRNA with the second unnatural amino acid. Optionally, a cell of the invention includes a nucleic acid that comprises a polynucleotide that encodes a polypeptide of interest, where the polynucleotide comprises a selector codon that is recognized by the O-tRNA.

In certain embodiments, a cell of the invention includes an E. coli cell that includes an orthogonal-tRNA (O-tRNA), an orthogonal aminoacyl-tRNA synthetase (O-RS), a redox-active amino acid, and a nucleic acid that comprises a polynucleotide that encodes a polypeptide of interest, where the polynucleotide comprises the selector codon that is recognized by the O-tRNA. In certain embodiments of the invention, the O-RS preferentially aminoacylates the O-tRNA with an efficiency of at least 50% of the efficiency of a polypeptide comprising an amino acid sequence of any listed O-RS sequence herein.

In certain embodiments of the invention, an O-tRNA of the invention comprises or is encoded by a polynucleotide sequence as set forth in the sequence listings or examples herein, or a complementary polynucleotide sequence thereof. In certain embodiments of the invention, an O-RS comprises an amino acid sequence as set forth in the sequence listings, or a conservative variation thereof. In one embodiment, the O-RS or a portion thereof is encoded by a polynucleotide sequence encoding an amino acid as set forth in the sequence listings or examples herein, or a complementary polynucleotide sequence thereof.

The O-tRNA and/or the O-RS of the invention can be derived from any of a variety of organisms (e.g., eukaryotic and/or non-eukaryotic organisms).

Polynucleotides are also a feature of the invention. A polynucleotide of the invention includes an artificial (e.g., man-made, and not naturally occurring) polynucleotide comprising a nucleotide sequence encoding a polypeptide as set forth in the sequence listings herein, and/or is complementary to that polynucleotide sequence. A polynucleotide of the invention can also include a nucleic acid that hybridizes to a polynucleotide described above, under highly stringent conditions, over substantially the entire length of the nucleic acid. A polynucleotide of the invention also includes a polynucleotide that is, e.g., at least 75%, at least 80%, at least 90%, at least 95%, at least 98% or more identical to that of a naturally occurring tRNA or corresponding coding nucleic acid (but a polynucleotide of the invention is other than a naturally occurring tRNA or corresponding coding nucleic acid), where the tRNA recognizes a selector codon, e.g., a four-base codon. Artificial polynucleotides that are, e.g., at least 80%, at least 90%, at least 95%, at least 98% or more identical to any of the above and/or a polynucleotide comprising a conservative variation of any of the above, are also included in polynucleotides of the invention.
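Percent identity thresholds such as those above compare two sequences position by position. The sketch below is a simplified illustration that assumes the two sequences are already aligned and of equal length; it does not perform alignment or handle gaps, and it is not the comparison procedure specified by the disclosure. The function name and the example sequences are hypothetical.

```python
def percent_identity(seq_a, seq_b):
    """Percent of identical positions between two pre-aligned,
    equal-length sequences (no gap handling, no alignment step)."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


# Hypothetical ten-nucleotide sequences differing at one position.
natural = "ACGUACGUAC"
variant = "ACGUACGUAA"

print(percent_identity(natural, variant))
```

A variant that scores below 100% against a naturally occurring sequence is, by construction, other than that sequence, which matches the proviso in the paragraph above.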

Vectors comprising a polynucleotide of the invention are also a feature of the invention. For example, a vector of the invention can include a plasmid, a cosmid, a phage, a virus, an expression vector, and/or the like. A cell comprising a vector of the invention is also a feature of the invention.

Methods of producing components of an O-tRNA/O-RS pair are also features of the invention. Components produced by these methods are also a feature of the invention. For example, methods of producing at least one tRNA that is orthogonal to a cell (O-tRNA) include generating a library of mutant tRNAs; mutating an anticodon loop of each member of the library of mutant tRNAs to allow recognition of a selector codon, thereby providing a library of potential O-tRNAs, and subjecting to negative selection a first population of cells of a first species, where the cells comprise a member of the library of potential O-tRNAs. The negative selection eliminates cells that comprise a member of the library of potential O-tRNAs that is aminoacylated by an aminoacyl-tRNA synthetase (RS) that is endogenous to the cell. This provides a pool of tRNAs that are orthogonal to the cell of the first species, thereby providing at least one O-tRNA. An O-tRNA produced by the methods of the invention is also provided.

In certain embodiments, the methods further comprise subjecting to positive selection a second population of cells of the first species, where the cells comprise a member of the pool of tRNAs that are orthogonal to the cell of the first species, a cognate aminoacyl-tRNA synthetase, and a positive selection marker. Using the positive selection, cells are selected or screened for those cells that comprise a member of the pool of tRNAs that is aminoacylated by the cognate aminoacyl-tRNA synthetase and that shows a desired response in the presence of the positive selection marker, thereby providing an O-tRNA. In certain embodiments, the second population of cells comprises cells that were not eliminated by the negative selection.
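The logic of the negative selection followed by the positive selection can be sketched as two filters over a candidate library. This is a bookkeeping illustration only: the class `CandidateTRNA`, its boolean attributes, and the three example members are hypothetical, and real selections act on cell survival or reporter response rather than on known properties.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidateTRNA:
    name: str
    charged_by_host_rs: bool     # aminoacylated by an endogenous synthetase
    charged_by_cognate_rs: bool  # aminoacylated by the cognate synthetase


def negative_selection(library):
    """Eliminate members aminoacylated by an endogenous RS, leaving a pool
    of tRNAs that are orthogonal to the host cell."""
    return [t for t in library if not t.charged_by_host_rs]


def positive_selection(pool):
    """Keep members that are aminoacylated by the cognate RS."""
    return [t for t in pool if t.charged_by_cognate_rs]


library = [
    CandidateTRNA("variant-1", charged_by_host_rs=True, charged_by_cognate_rs=True),
    CandidateTRNA("variant-2", charged_by_host_rs=False, charged_by_cognate_rs=False),
    CandidateTRNA("variant-3", charged_by_host_rs=False, charged_by_cognate_rs=True),
]

orthogonal_pool = negative_selection(library)
o_trnas = positive_selection(orthogonal_pool)
print([t.name for t in o_trnas])
```

Only a member that survives both filters, here the hypothetical variant-3, is both orthogonal to the host and functional with the cognate synthetase.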

Methods for identifying an orthogonal-aminoacyl-tRNA synthetase that charges an O-tRNA with a redox active amino acid are also provided. For example, methods include subjecting a population of cells of a first species to a selection, where the cells each comprise: 1) a member of a plurality of aminoacyl-tRNA synthetases (RSs) (e.g., the plurality of RSs can include mutant RSs, RSs derived from a species other than a first species, or both mutant RSs and RSs derived from a species other than a first species); 2) the orthogonal-tRNA (O-tRNA) (e.g., from one or more species); and 3) a polynucleotide that encodes a positive selection marker and comprises at least one selector codon.

Cells (e.g., a host cell) are selected or screened for those that show an enhancement in suppression efficiency compared to cells lacking or having a reduced amount of the member of the plurality of RSs. These selected/screened cells comprise an active RS that aminoacylates the O-tRNA. An orthogonal aminoacyl-tRNA synthetase identified by the method is also a feature of the invention.

Methods of producing a protein in a cell (e.g., a non-eukaryotic cell, such as an E. coli cell or the like, or a eukaryotic cell) with a 3,4-dihydroxy-L-phenylalanine (DHP) at a specified position are also a feature of the invention. For example, a method includes growing, in an appropriate medium, a cell, where the cell comprises a nucleic acid that comprises at least one selector codon and encodes a protein, providing the DHP, and incorporating the DHP into the specified position in the protein during translation of the nucleic acid with the at least one selector codon, thereby producing the protein. The cell further comprises: an orthogonal-tRNA (O-tRNA) that functions in the cell and recognizes the selector codon; and, an orthogonal aminoacyl-tRNA synthetase (O-RS) that preferentially aminoacylates the O-tRNA with the DHP. A protein produced by this method is also a feature of the invention.

The invention also provides compositions that include proteins, where the proteins comprise, e.g., a DHP. In certain embodiments, the protein comprises an amino acid sequence that is at least 75% identical to that of a known protein, e.g., a therapeutic protein, a diagnostic protein, an industrial enzyme, or portion thereof. Optionally, the composition comprises a pharmaceutically acceptable carrier.

Nucleic Acid and Polypeptide Sequence and Variants

As described above and below, the invention provides for nucleic acid polynucleotide sequences encoding, e.g., O-tRNAs and O-RSs, and polypeptide amino acid sequences, e.g., O-RSs, and, e.g., compositions, systems and methods comprising said sequences. Examples of said sequences, e.g., O-tRNA and O-RS amino acid and nucleotide sequences, are disclosed herein (see Table 1, e.g., SEQ ID NOS: 1 through 3). However, one of skill in the art will appreciate that the invention is not limited to those sequences disclosed herein, e.g., as in the Examples and sequence listing. One of skill will appreciate that the invention also provides, e.g., many related and unrelated sequences with the functions described herein, e.g., encoding an O-tRNA or an O-RS.

The invention provides polypeptides (O-RSs) and polynucleotides, e.g., O-tRNA, polynucleotides that encode O-RSs or portions thereof, oligonucleotides used to isolate aminoacyl-tRNA synthetase clones, etc. Polynucleotides of the invention include those that encode proteins or polypeptides of interest of the invention with one or more selector codons. In addition, polynucleotides of the invention include, e.g., a polynucleotide comprising a nucleotide sequence as set forth in SEQ ID NO.: 2; a polynucleotide that is complementary to or that encodes a polynucleotide sequence thereof. A polynucleotide of the invention also includes a polynucleotide that encodes an amino acid sequence comprising SEQ ID NO.: 1. A polynucleotide of the invention also includes a polynucleotide that encodes a polypeptide of the invention. Similarly, an artificial nucleic acid that hybridizes to a polynucleotide indicated above under highly stringent conditions over substantially the entire length of the nucleic acid (and is other than a naturally occurring polynucleotide) is a polynucleotide of the invention. In one embodiment, a composition includes a polypeptide of the invention and an excipient (e.g., buffer, water, pharmaceutically acceptable excipient, etc.). The invention also provides an antibody or antisera specifically immunoreactive with a polypeptide of the invention. An artificial polynucleotide is a polynucleotide that is man made and is not naturally occurring.

A polynucleotide of the invention also includes an artificial polynucleotide that is, e.g., at least 75%, at least 80%, at least 90%, at least 95%, at least 98% or more identical to that of a naturally occurring tRNA (but is other than a naturally occurring tRNA). A polynucleotide also includes an artificial polynucleotide that is, e.g., at least 75%, at least 80%, at least 90%, at least 95%, at least 98% or more identical to that of a naturally occurring tRNA.

In certain embodiments, a vector (e.g., a plasmid, a cosmid, a phage, a virus, etc.) comprises a polynucleotide of the invention. In one embodiment, the vector is an expression vector. In another embodiment, the expression vector includes a promoter operably linked to one or more of the polynucleotides of the invention. In another embodiment, a cell comprises a vector that includes a polynucleotide of the invention.

One of skill will also appreciate that many variants of the disclosed sequences are included in the invention. For example, conservative variations of the disclosed sequences that yield a functionally identical sequence are included in the invention. Variants of the nucleic acid polynucleotide sequences, wherein the variants hybridize to at least one disclosed sequence, are considered to be included in the invention. Unique subsequences of the sequences disclosed herein, as determined by, e.g., standard sequence comparison techniques, are also included in the invention.

Conservative Variations

Owing to the degeneracy of the genetic code, “silent substitutions” (i.e., substitutions in a nucleic acid sequence which do not result in an alteration in an encoded polypeptide) are an implied feature of every nucleic acid sequence which encodes an amino acid. Similarly, “conservative amino acid substitutions,” in which one or a few amino acids in an amino acid sequence are substituted with different amino acids with highly similar properties, are also readily identified as being highly similar to a disclosed construct. Such conservative variations of each disclosed sequence are a feature of the present invention.
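By way of illustration only, a silent substitution can be checked computationally by translating two coding sequences and comparing the encoded residues. The following is a minimal Python sketch. The function names, the abbreviated codon table (which covers only the standard-code codons used in the example), and the example sequences are hypothetical and are not taken from the sequences of the invention.

```python
# Abbreviated codon table (standard genetic code); it covers only the
# codons used in this illustration and is not a complete table.
CODON_TABLE = {
    "GCT": "A", "GCC": "A", "GCA": "A", "GCG": "A",  # alanine
    "TAT": "Y", "TAC": "Y",                          # tyrosine
    "AAA": "K", "AAG": "K",                          # lysine
    "GAT": "D", "GAC": "D",                          # aspartate
}


def translate(dna):
    """Translate a coding sequence codon by codon using CODON_TABLE."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def is_silent_variant(reference, variant):
    """True if the sequences differ but encode the same polypeptide."""
    return reference != variant and translate(reference) == translate(variant)


# Hypothetical example sequences: both encode Ala-Tyr-Lys-Asp.
reference = "GCTTATAAAGAT"
variant = "GCCTACAAGGAC"
print(translate(reference), is_silent_variant(reference, variant))
```

In this sketch every codon of the variant differs from the reference, yet the encoded polypeptide is unchanged, which is the sense in which such substitutions are "silent."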

“Conservative variations” of a particular nucleic acid sequence refers to those nucleic acids which encode identical or essentially identical amino acid sequences, or, where the nucleic acid does not encode an amino acid sequence, to essentially identical sequences. One of skill will recognize that individual substitutions, deletions or additions which alter, add or delete a single amino acid or a small percentage of amino acids (typically less than 5%, more typically less than 4%, 2% or 1%) in an encoded sequence are “conservatively modified variations” where the alterations result in the deletion of an amino acid, addition of an amino acid, or substitution of an amino acid with a chemically similar amino acid. Thus, “conservative variations” of a listed polypeptide sequence of the present invention include substitutions of a small percentage, typically less than 5%, more typically less than 2% or 1%, of the amino acids of the polypeptide sequence, with a conservatively selected amino acid of the same conservative substitution group. Finally, the addition of sequences which do not alter the encoded activity of a nucleic acid molecule, such as the addition of a non-functional sequence, is a conservative variation of the basic nucleic acid.

Conservative substitution tables providing functionally similar amino acids are well known in the art, where one amino acid residue is substituted for another amino acid residue having similar chemical properties (e.g., aromatic side chains or positively charged side chains), and therefore does not substantially change the functional properties of the polypeptide molecule. The following sets forth example groups that contain natural amino acids of like chemical properties, where a substitution within a group is a “conservative substitution”.

Nonpolar and/or Aliphatic Side Chains: Glycine, Alanine, Valine, Leucine, Isoleucine, Proline
Polar, Uncharged Side Chains: Serine, Threonine, Cysteine, Methionine, Asparagine, Glutamine
Aromatic Side Chains: Phenylalanine, Tyrosine, Tryptophan
Positively Charged Side Chains: Lysine, Arginine, Histidine
Negatively Charged Side Chains: Aspartate, Glutamate
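As a non-limiting illustration, the example groups set forth above can be represented as a lookup structure so that a proposed substitution can be classified as conservative or not. The following Python sketch uses hypothetical names (CONSERVATIVE_GROUPS, group_of, is_conservative_substitution); only the group memberships are taken from the text.

```python
# Example substitution groups as listed in the text; the dictionary keys
# are hypothetical labels chosen for this illustration.
CONSERVATIVE_GROUPS = {
    "nonpolar_aliphatic": {"Glycine", "Alanine", "Valine", "Leucine",
                           "Isoleucine", "Proline"},
    "polar_uncharged": {"Serine", "Threonine", "Cysteine", "Methionine",
                        "Asparagine", "Glutamine"},
    "aromatic": {"Phenylalanine", "Tyrosine", "Tryptophan"},
    "positively_charged": {"Lysine", "Arginine", "Histidine"},
    "negatively_charged": {"Aspartate", "Glutamate"},
}


def group_of(residue):
    """Return the label of the group that contains the residue."""
    for label, members in CONSERVATIVE_GROUPS.items():
        if residue in members:
            return label
    raise KeyError(f"{residue!r} is not in any example group")


def is_conservative_substitution(original, replacement):
    """True if a different residue from the same group is substituted."""
    return original != replacement and group_of(original) == group_of(replacement)


print(is_conservative_substitution("Lysine", "Arginine"))   # same group
print(is_conservative_substitution("Lysine", "Aspartate"))  # different groups
```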

Nucleic Acid Hybridization

Comparative hybridization can be used to identify nucleic acids of the invention, such as SEQ ID NO.: 2, including conservative variations of nucleic acids of the invention, and this comparative hybridization method is a preferred method of distinguishing nucleic acids of the invention. In addition, target nucleic acids which hybridize to a nucleic acid represented by SEQ ID NO: 2 under high, ultra-high and ultra-ultra high stringency conditions are a feature of the invention. Examples of such nucleic acids include those with one or a few silent or conservative nucleic acid substitutions as compared to a given nucleic acid sequence.

A test nucleic acid is said to specifically hybridize to a probe nucleic acid when it hybridizes at least ½ as well to the probe as to the perfectly matched complementary target, i.e., with a signal to noise ratio at least ½ as high as hybridization of the probe to the target under conditions in which the perfectly matched probe binds to the perfectly matched complementary target with a signal to noise ratio that is at least about 5×-10× as high as that observed for hybridization to any of the unmatched target nucleic acids.

Nucleic acids “hybridize” when they associate, typically in solution. Nucleic acids hybridize due to a variety of well characterized physico-chemical forces, such as hydrogen bonding, solvent-exclusion, base stacking and the like. An extensive guide to the hybridization of nucleic acids is found in Tijssen (1993) Laboratory Techniques in Biochemistry and Molecular Biology: Hybridization with Nucleic Acid Probes, part I, chapter 2, “Overview of principles of hybridization and the strategy of nucleic acid probe assays,” (Elsevier, New York), as well as in Ausubel, supra. Hames and Higgins (1995) Gene Probes 1, IRL Press at Oxford University Press, Oxford, England (Hames and Higgins 1) and Hames and Higgins (1995) Gene Probes 2, IRL Press at Oxford University Press, Oxford, England (Hames and Higgins 2) provide details on the synthesis, labeling, detection and quantification of DNA and RNA, including oligonucleotides.

An example of stringent hybridization conditions for hybridization of complementary nucleic acids which have more than 100 complementary residues on a filter in a Southern or northern blot is 50% formalin with 1 mg of heparin at 42° C., with the hybridization being carried out overnight. An example of stringent wash conditions is a 0.2×SSC wash at 65° C. for 15 minutes (see, Sambrook, supra for a description of SSC buffer). Often the high stringency wash is preceded by a low stringency wash to remove background probe signal. An example low stringency wash is 2×SSC at 40° C. for 15 minutes. In general, a signal to noise ratio of 5× (or higher) than that observed for an unrelated probe in the particular hybridization assay indicates detection of a specific hybridization.

“Stringent hybridization wash conditions” in the context of nucleic acid hybridization experiments such as Southern and northern hybridizations are sequence dependent, and are different under different environmental parameters. An extensive guide to the hybridization of nucleic acids is found in Tijssen (1993), supra, and in Hames and Higgins, 1 and 2. Stringent hybridization and wash conditions can easily be determined empirically for any test nucleic acid. For example, in determining stringent hybridization and wash conditions, the stringency of the hybridization and wash conditions is gradually increased (e.g., by increasing temperature, decreasing salt concentration, increasing detergent concentration and/or increasing the concentration of organic solvents such as formalin in the hybridization or wash), until a selected set of criteria are met. For example, in highly stringent hybridization and wash conditions, the stringency of the hybridization and wash conditions is gradually increased until a probe binds to a perfectly matched complementary target with a signal to noise ratio that is at least 5× as high as that observed for hybridization of the probe to an unmatched target.

“Very stringent” conditions are selected to be equal to the thermal melting point (T_m) for a particular probe. The T_m is the temperature (under defined ionic strength and pH) at which 50% of the test sequence hybridizes to a perfectly matched probe. For the purposes of the present invention, generally, “highly stringent” hybridization and wash conditions are selected to be about 5° C. lower than the T_m for the specific sequence at a defined ionic strength and pH.

“Ultra high-stringency” hybridization and wash conditions are those in which the stringency of hybridization and wash conditions is increased until the signal to noise ratio for binding of the probe to the perfectly matched complementary target nucleic acid is at least 10× as high as that observed for hybridization to any of the unmatched target nucleic acids. A target nucleic acid which hybridizes to a probe under such conditions, with a signal to noise ratio of at least ½ that of the perfectly matched complementary target nucleic acid, is said to bind to the probe under ultra-high stringency conditions.

Similarly, even higher levels of stringency can be determined by gradually increasing the stringency of the hybridization and/or wash conditions of the relevant hybridization assay. Examples include those in which the stringency of hybridization and wash conditions is increased until the signal to noise ratio for binding of the probe to the perfectly matched complementary target nucleic acid is at least 10×, 20×, 50×, 100×, or 500× or more as high as that observed for hybridization to any of the unmatched target nucleic acids. A target nucleic acid which hybridizes to a probe under such conditions, with a signal to noise ratio of at least ½ that of the perfectly matched complementary target nucleic acid, is said to bind to the probe under ultra-ultra-high stringency conditions.
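The signal to noise criteria described above can be summarized, purely as an illustrative sketch, in a few lines of Python. The function names and numeric inputs are hypothetical. The sketch assumes that each argument is a measured signal to noise ratio and that the fold threshold (e.g., 5, 10, 20, 50, 100 or 500) is supplied by the user according to the stringency level of interest.

```python
def conditions_meet_stringency(matched_sn, unmatched_sn, fold):
    """True if the perfectly matched target gives a signal to noise ratio
    at least `fold` times that observed for an unmatched target."""
    return matched_sn >= fold * unmatched_sn


def binds_under_conditions(test_sn, matched_sn, unmatched_sn, fold=10.0):
    """True if the conditions meet the chosen stringency level and the
    test nucleic acid hybridizes with a signal to noise ratio of at
    least one half that of the perfectly matched target."""
    return (conditions_meet_stringency(matched_sn, unmatched_sn, fold)
            and test_sn >= 0.5 * matched_sn)


# Hypothetical measurements: matched target 100, unmatched target 8.
print(binds_under_conditions(60, 100, 8, fold=10))   # meets both criteria
print(binds_under_conditions(40, 100, 8, fold=10))   # below one half of matched
print(binds_under_conditions(60, 100, 20, fold=10))  # conditions not stringent enough
```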

Nucleic acids which do not hybridize to each other under stringent conditions are still substantially identical if the polypeptides which they encode are substantially identical. This occurs, e.g., when a copy of a nucleic acid is created using the maximum codon degeneracy permitted by the genetic code.

Unique Subsequences

In one aspect, the invention provides a nucleic acid that comprises a unique subsequence in a nucleic acid selected from the sequences of O-tRNAs and O-RSs disclosed herein. The unique subsequence is unique as compared to a nucleic acid corresponding to any known O-tRNA or O-RS nucleic acid sequence. Alignment can be performed using, e.g., BLAST set to default parameters. Any unique subsequence is useful, e.g., as a probe to identify the nucleic acids of the invention.

Similarly, the invention includes a polypeptide which comprises a unique subsequence in a polypeptide selected from the sequences of O-RSs disclosed herein. Here, the unique subsequence is unique as compared to a polypeptide corresponding to any known polypeptide sequence.

The invention also provides for target nucleic acids which hybridize under stringent conditions to a unique coding oligonucleotide which encodes a unique subsequence in a polypeptide selected from the sequences of O-RSs, wherein the unique subsequence is unique as compared to a polypeptide corresponding to any of the control polypeptides (e.g., parental sequences from which synthetases of the invention were derived, e.g., by mutation). Unique sequences are determined as noted above.

Sequence Comparison, Identity, and Homology

The terms “identical” or percent “identity,” in the context of two or more nucleic acid or polypeptide sequences, refer to two or more sequences or subsequences that are the same or have a specified percentage of amino acid residues or nucleotides that are the same, when compared and aligned for maximum correspondence, as measured using one of the sequence comparison algorithms described below (or other algorithms available to persons of skill) or by visual inspection.

The phrase “substantially identical,” in the context of two nucleic acids or polypeptides (e.g., DNAs encoding an O-tRNA or O-RS, or the amino acid sequence of an O-RS) refers to two or more sequences or subsequences that have at least about 60%, about 80%, about 90-95%, about 98%, about 99% or more nucleotide or amino acid residue identity, when compared and aligned for maximum correspondence, as measured using a sequence comparison algorithm or by visual inspection. Such “substantially identical” sequences are typically considered to be “homologous,” without reference to actual ancestry. Preferably, the “substantial identity” exists over a region of the sequences that is at least about 50 residues in length, more preferably over a region of at least about 100 residues, and most preferably, the sequences are substantially identical over at least about 150 residues, or over the full length of the two sequences to be compared.
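For illustration, percent identity over a pair of sequences that have already been aligned can be computed as sketched below in Python. This is a simplified sketch and not the output of any particular alignment program. It assumes that the two strings are pre-aligned to equal length with "-" marking gaps, and that identity is expressed relative to the number of alignment columns (other conventions for the denominator exist). The function names and the default thresholds of 60% identity over at least 50 residues are chosen here as one of the combinations recited above.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of alignment columns in which both sequences carry the
    same residue; gap characters never count as matches."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("aligned sequences must not be empty")
    matches = sum(1 for x, y in zip(aligned_a, aligned_b)
                  if x == y and x != gap)
    return 100.0 * matches / len(aligned_a)


def is_substantially_identical(aligned_a, aligned_b,
                               min_identity=60.0, min_length=50):
    """Apply an identity threshold over a region of a minimum length."""
    return (len(aligned_a) >= min_length
            and percent_identity(aligned_a, aligned_b) >= min_identity)


# Hypothetical aligned sequences differing at one of eight positions.
print(percent_identity("ACGTACGT", "ACGTTCGT"))
```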

Proteins and/or protein sequences are “homologous” when they are derived, naturally or artificially, from a common ancestral protein or protein sequence. Similarly, nucleic acids and/or nucleic acid sequences are homologous when they are derived, naturally or artificially, from a common ancestral nucleic acid or nucleic acid sequence. For example, any naturally occurring nucleic acid can be modified by any available mutagenesis method to include one or more selector codons. When expressed, this mutagenized nucleic acid encodes a polypeptide comprising one or more redox active amino acids, e.g., unnatural amino acids. The mutation process can, of course, additionally alter one or more standard codons, thereby changing one or more standard amino acids in the resulting mutant protein as well. Homology is generally inferred from sequence similarity between two or more nucleic acids or proteins (or sequences thereof). The precise percentage of similarity between sequences that is useful in establishing homology varies with the nucleic acid and protein at issue, but as little as 25% sequence similarity is routinely used to establish homology. Higher levels of sequence similarity, e.g., 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, or 99% or more, can also be used to establish homology. Methods for determining sequence similarity percentages (e.g., BLASTP and BLASTN using default parameters) are described herein and are generally available.

For sequence comparison and homology determination, typically one sequence acts as a reference sequence to which test sequences are compared. When using a sequence comparison algorithm, test and reference sequences are input into a computer, subsequence coordinates are designated, if necessary, and sequence algorithm program parameters are designated. The sequence comparison algorithm then calculates the percent sequence identity for the test sequence(s) relative to the reference sequence, based on the designated program parameters.

Optimal alignment of sequences for comparison can be conducted, e.g., by the local homology algorithm of Smith & Waterman, Adv. Appl. Math. 2:482 (1981), by the homology alignment algorithm of Needleman & Wunsch, J. Mol. Biol. 48:443 (1970), by the search for similarity method of Pearson & Lipman, Proc. Nat'l. Acad. Sci. USA 85:2444 (1988), by computerized implementations of these algorithms (GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics Software Package, Genetics Computer Group, 575 Science Dr., Madison, Wis.), or by visual inspection (see generally Ausubel et al., infra).

One example of an algorithm that is suitable for determining percent sequence identity and sequence similarity is the BLAST algorithm, which is described in Altschul et al., J. Mol. Biol. 215:403-410 (1990). Software for performing BLAST analyses is publicly available through the National Center for Biotechnology Information (www.ncbi.nlm.nih.gov/). This algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short words of length W in the query sequence, which either match or satisfy some positive-valued threshold score T when aligned with a word of the same length in a database sequence. T is referred to as the neighborhood word score threshold (Altschul et al., supra). These initial neighborhood word hits act as seeds for initiating searches to find longer HSPs containing them. The word hits are then extended in both directions along each sequence for as far as the cumulative alignment score can be increased. Cumulative scores are calculated using, for nucleotide sequences, the parameters M (reward score for a pair of matching residues; always >0) and N (penalty score for mismatching residues; always <0). For amino acid sequences, a scoring matrix is used to calculate the cumulative score. Extension of the word hits in each direction is halted when: the cumulative alignment score falls off by the quantity X from its maximum achieved value; the cumulative score goes to zero or below, due to the accumulation of one or more negative-scoring residue alignments; or the end of either sequence is reached. The BLAST algorithm parameters W, T, and X determine the sensitivity and speed of the alignment. The BLASTN program (for nucleotide sequences) uses as defaults a wordlength (W) of 11, an expectation (E) of 10, a cutoff of 100, M=5, N=-4, and a comparison of both strands. For amino acid sequences, the BLASTP program uses as defaults a wordlength (W) of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix (see Henikoff & Henikoff (1989) Proc. Natl. Acad. Sci. USA 89:10915).
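The extension step described above can be sketched, in simplified form, as follows. This Python sketch is an illustration of the stated stopping rules only and is not the BLAST implementation itself. It extends a single word hit in one direction without gaps, uses the nucleotide scores M=5 and N=-4 recited above as defaults, and treats the function name, the value of X (x_drop), and the example sequences as hypothetical.

```python
def extend_seed_right(query, subject, q_start, s_start, word_len,
                      match=5, mismatch=-4, x_drop=20):
    """Extend a word hit to the right without gaps.

    Returns (best_score, best_length), where best_length is the length of
    the highest scoring segment beginning at the start of the word hit.
    """
    # Score the seed word itself.
    score = sum(match if query[q_start + k] == subject[s_start + k] else mismatch
                for k in range(word_len))
    best_score, best_len = score, word_len
    i, j = q_start + word_len, s_start + word_len
    # Stop when the end of either sequence is reached.
    while i < len(query) and j < len(subject):
        score += match if query[i] == subject[j] else mismatch
        if score > best_score:
            best_score, best_len = score, i - q_start + 1
        elif best_score - score >= x_drop or score <= 0:
            # Score fell off by X from its maximum, or dropped to zero or below.
            break
        i += 1
        j += 1
    return best_score, best_len


# Hypothetical sequences sharing their first eight residues.
print(extend_seed_right("ACGTACGTACGT", "ACGTACGTTTTT", 0, 0, 4, x_drop=8))
```

A full search would also extend each hit to the left and apply the same rules symmetrically; that step is omitted here to keep the sketch short.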

In addition to calculating percent sequence identity, the BLAST algorithm also performs a statistical analysis of the similarity between two sequences (see, e.g., Karlin & Altschul, Proc. Nat'l. Acad. Sci. USA 90:5873-5877 (1993)). One measure of similarity provided by the BLAST algorithm is the smallest sum probability (P(N)), which provides an indication of the probability by which a match between two nucleotide or amino acid sequences would occur by chance. For example, a nucleic acid is considered similar to a reference sequence if the smallest sum probability in a comparison of the test nucleic acid to the reference nucleic acid is less than about 0.1, more preferably less than about 0.01, and most preferably less than about 0.001.

Mutagenesis and Other Molecular Biology Techniques

Polynucleotides and polypeptides of the invention and used in the invention can be manipulated using molecular biological techniques. General texts which describe molecular biological techniques include Berger and Kimmel, Guide to Molecular Cloning Techniques, Methods in Enzymology volume 152, Academic Press, Inc., San Diego, Calif. (Berger); Sambrook et al., Molecular Cloning: A Laboratory Manual (3rd Ed.), Vol. 1-3, Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y., 2001 (“Sambrook”); and Current Protocols in Molecular Biology, F. M. Ausubel et al., eds., Current Protocols, a joint venture between Greene Publishing Associates, Inc. and John Wiley & Sons, Inc., (supplemented through 2003) (“Ausubel”). These texts describe mutagenesis, the use of vectors, promoters and many other relevant topics related to, e.g., the generation of genes that include selector codons for production of proteins that include redox active amino acids (e.g., DHP), orthogonal tRNAs, orthogonal synthetases, and pairs thereof.

Various types of mutagenesis are used in the invention, e.g., to mutate tRNA molecules, to produce libraries of tRNAs, to produce libraries of synthetases, and to insert selector codons that encode a redox active amino acid in a protein or polypeptide of interest. They include but are not limited to site-directed mutagenesis, random point mutagenesis, homologous recombination, DNA shuffling or other recursive mutagenesis methods, chimeric construction, mutagenesis using uracil containing templates, oligonucleotide-directed mutagenesis, phosphorothioate-modified DNA mutagenesis, mutagenesis using gapped duplex DNA or the like, or any combination thereof. Additional suitable methods include point mismatch repair, mutagenesis using repair-efficient host strains, restriction-selection and restriction-purification, deletion mutagenesis, mutagenesis by total gene synthesis, double-strand break repair, and the like. Mutagenesis, e.g., involving chimeric constructs, is also included in the present invention. In one embodiment, mutagenesis can be guided by known information of the naturally occurring molecule or altered or mutated naturally occurring molecule, e.g., sequence, sequence comparisons, physical properties, crystal structure or the like.
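As a purely illustrative sketch of designing a site-directed change in silico, the following Python function replaces the codon at a chosen position of a coding sequence with a selector codon. The function name and example sequence are hypothetical, and the use of the amber stop codon TAG as the default selector codon is an assumption made for this illustration; any other selector codon can be passed in its place.

```python
def introduce_selector_codon(coding_seq, codon_index, selector_codon="TAG"):
    """Return a copy of coding_seq in which the codon at the zero-based
    position codon_index is replaced by selector_codon.

    TAG (amber) is an assumed default used only for illustration.
    """
    if len(coding_seq) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    if len(selector_codon) != 3:
        raise ValueError("selector codon must be three nucleotides long")
    n_codons = len(coding_seq) // 3
    if not 0 <= codon_index < n_codons:
        raise IndexError("codon index out of range")
    start = codon_index * 3
    return coding_seq[:start] + selector_codon + coding_seq[start + 3:]


# Hypothetical coding sequence ATG-GCT-TAT-AAA; the tyrosine codon (index 2)
# is replaced by the selector codon.
print(introduce_selector_codon("ATGGCTTATAAA", 2))
```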

Host cells are genetically engineered (e.g., transformed, transduced or transfected) with the polynucleotides of the invention or constructs which include a polynucleotide of the invention, e.g., a vector of the invention, which can be, for example, a cloning vector or an expression vector. For example, the coding regions for the orthogonal tRNA, the orthogonal tRNA synthetase, and the protein to be derivatized are operably linked to gene expression control elements that are functional in the desired host cell. Typical vectors contain transcription and translation terminators, transcription and translation initiation sequences, and promoters useful for regulation of the expression of the particular target nucleic acid. The vectors optionally comprise generic expression cassettes containing at least one independent terminator sequence, sequences permitting replication of the cassette in eukaryotes, or prokaryotes, or both (e.g., shuttle vectors) and selection markers for both prokaryotic and eukaryotic systems. Vectors are suitable for replication and/or integration in prokaryotes, eukaryotes, or preferably both. See Giliman & Smith, Gene 8:81 (1979); Roberts, et al., Nature, 328:731 (1987); Schneider, B., et al., Protein Expr. Purif. 6435:10 (1995); Ausubel, Sambrook, Berger (all supra). The vector can be, for example, in the form of a plasmid, a bacterium, a virus, a naked polynucleotide, or a conjugated polynucleotide. The vectors are introduced into cells and/or microorganisms by standard methods including electroporation (From et al., Proc. Natl. Acad. Sci. USA 82, 5824 (1985)), infection by viral vectors, high velocity ballistic penetration by small particles with the nucleic acid either within the matrix of small beads or particles, or on the surface (Klein et al., Nature 327, 70-73 (1987)), and/or the like.

A catalogue of Bacteria and Bacteriophages useful for cloning is provided, e.g., by the ATCC, e.g., The ATCC Catalogue of Bacteria and Bacteriophage (1996) Gherna et al. (eds) published by the ATCC. Additional basic procedures for sequencing, cloning and other aspects of molecular biology and underlying theoretical considerations are also found in Sambrook (supra), Ausubel (supra), and in Watson et al. (1992) Recombinant DNA Second Edition Scientific American Books, NY. In addition, essentially any nucleic acid (and virtually any labeled nucleic acid, whether standard or non-standard) can be custom or standard ordered from any of a variety of commercial sources, such as the Midland Certified Reagent Company (Midland, Tex. mcrc.com), The Great American Gene Company (Ramona, Calif. available on the World Wide Web at genco.com), ExpressGen Inc. (Chicago, Ill. available on the World Wide Web at expressgen.com), Operon Technologies Inc. (Alameda, Calif.) and many others.

The engineered host cells can be cultured in conventional nutrient media modified as appropriate for such activities as, for example, screening steps, activating promoters or selecting transformants. These cells can optionally be cultured into transgenic organisms. Other useful references, e.g. for cell isolation and culture (e.g., for subsequent nucleic acid isolation) include Freshney (1994) Culture of Animal Cells, a Manual of Basic Technique, third edition, Wiley-Liss, New York and the references cited therein; Payne et al. (1992) Plant Cell and Tissue Culture in Liquid Systems John Wiley & Sons, Inc. New York, N.Y.; Gamborg and Phillips (eds) (1995) Plant Cell, Tissue and Organ Culture; Fundamental Methods Springer Lab Manual, Springer-Verlag (Berlin Heidelberg New York) and Atlas and Parks (eds) The Handbook of Microbiological Media (1993) CRC Press, Boca Raton, Fla.

Proteins and Polypeptides of Interest

One advantage of redox active amino acids is that they can be used to engineer electron transfer processes in proteins. Other advantages include, but are not limited to, that expression of redox active proteins can facilitate the study and the ability to alter electron transfer pathways in proteins, alter catalytic function of enzymes, crosslink protein with small molecules and biomolecules, etc. Proteins or polypeptides of interest with at least one redox active amino acid are a feature of the invention. The invention also includes polypeptides or proteins with at least one redox active amino acid produced using the compositions and methods of the invention. An excipient (e.g., a pharmaceutically acceptable excipient) can also be present with the protein. Optionally, a protein of the invention will include a post-translational modification.

Methods of producing a protein in a cell with a redox active amino acid at a specified position are also a feature of the invention. For example, a method includes growing, in an appropriate medium, the cell, where the cell comprises a nucleic acid that comprises at least one selector codon and encodes a protein; and, providing the redox active amino acid; where the cell further comprises: an orthogonal-tRNA (O-tRNA) that functions in the cell and recognizes the selector codon; and, an orthogonal aminoacyl-tRNA synthetase (O-RS) that preferentially aminoacylates the O-tRNA with the redox active amino acid. In certain embodiments, the O-tRNA comprises at least about, e.g., a 45%, a 50%, a 60%, a 75%, a 80%, or a 90% or more suppression efficiency in the presence of a cognate synthetase in response to the selector codon as compared to the O-tRNA comprising or encoded by a polynucleotide sequence as set forth in SEQ ID NO.: 2. A protein produced by this method is also a feature of the invention.

The invention also provides compositions that include proteins, where the proteins comprise a redox active amino acid. In certain embodiments, the protein comprises an amino acid sequence that is at least 75% identical to that of a therapeutic protein, a diagnostic protein, an industrial enzyme, or portion thereof.

The compositions of the invention and compositions made by the methods of the invention optionally are in a cell. The O-tRNA/O-RS pairs or individual components of the invention can then be used in a host system's translation machinery, which results in a redox active amino acid being incorporated into a protein. International Application Number PCT/US2004/011786, filed Apr. 16, 2004, entitled “Expanding the Eukaryotic Genetic Code;” and, WO 2002/085923, entitled “IN VIVO INCORPORATION OF UNNATURAL AMINO ACIDS” describe this process, and are incorporated herein by reference. For example, when an O-tRNA/O-RS pair is introduced into a host, e.g., Escherichia coli, the pair leads to the in vivo incorporation of a redox active amino acid, such as DHP, e.g., a synthetic amino acid, such as a derivative of a tyrosine or phenylalanine amino acid, which can be exogenously added to the growth medium, into a protein, in response to a selector codon. Optionally, the compositions of the present invention can be in an in vitro translation system, or in an in vivo system(s).

A cell of the invention provides the ability to synthesize proteins that comprise unnatural amino acids in large useful quantities. In one aspect, the composition optionally includes, e.g., at least 10 micrograms, at least 50 micrograms, at least 75 micrograms, at least 100 micrograms, at least 200 micrograms, at least 250 micrograms, at least 500 micrograms, at least 1 milligram, at least 10 milligrams or more of the protein that comprises a redox active amino acid, or an amount that can be achieved with in vivo protein production methods (details on recombinant protein production and purification are provided herein). In another aspect, the protein is optionally present in the composition at a concentration of, e.g., at least 10 micrograms of protein per liter, at least 50 micrograms of protein per liter, at least 75 micrograms of protein per liter, at least 100 micrograms of protein per liter, at least 200 micrograms of protein per liter, at least 250 micrograms of protein per liter, at least 500 micrograms of protein per liter, at least 1 milligram of protein per liter, or at least 10 milligrams of protein per liter or more, in, e.g., a cell lysate, a buffer, a pharmaceutical buffer, or other liquid suspension (e.g., in a volume of, e.g., anywhere from about 1 nL to about 100 L). The production of large quantities (e.g., greater than that typically possible with other methods, e.g., in vitro translation) of a protein in a cell including at least one redox active amino acid is a feature of the invention.

The incorporation of a redox active amino acid can be done to, e.g., tailor changes in protein structure and/or function, e.g., to change size, acidity, nucleophilicity, hydrogen bonding, hydrophobicity, accessibility of protease target sites, target to a moiety (e.g., for a protein array), etc. Proteins that include a redox active amino acid can have enhanced or even entirely new catalytic or physical properties. For example, the following properties are optionally modified by inclusion of a redox active amino acid into a protein: toxicity, biodistribution, structural properties, spectroscopic properties, chemical and/or photochemical properties, catalytic ability, half-life (e.g., serum half-life), ability to react with other molecules, e.g., covalently or noncovalently, and the like. The compositions including proteins that include at least one redox active amino acid are useful for, e.g., novel therapeutics, diagnostics, catalytic enzymes, industrial enzymes, binding proteins (e.g., antibodies), and e.g., the study of protein structure and function. See, e.g., Dougherty, (2000) Unnatural Amino Acids as Probes of Protein Structure and Function, Current Opinion in Chemical Biology, 4:645-652.

In one aspect of the invention, a composition includes at least one protein with at least one, e.g., at least two, at least three, at least four, at least five, at least six, at least seven, at least eight, at least nine, or at least ten or more unnatural amino acids, e.g., redox active amino acids and/or other unnatural amino acids. The unnatural amino acids can be the same or different, e.g., there can be 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 or more different sites in the protein that comprise 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 or more different unnatural amino acids. In another aspect, a composition includes a protein in which at least one, but fewer than all, of a particular amino acid present in the protein is substituted with the redox active amino acid. For a given protein with more than one unnatural amino acid, the unnatural amino acids can be identical or different (e.g., the protein can include two or more different types of unnatural amino acids, or can include two of the same unnatural amino acid). For a given protein with more than two unnatural amino acids, the unnatural amino acids can be the same, different, or a combination of multiple unnatural amino acids of the same kind with at least one different unnatural amino acid.

Essentially any protein (or portion thereof) that includes a redox active amino acid (and any corresponding coding nucleic acid, e.g., which includes one or more selector codons) can be produced using the compositions and methods herein. No attempt is made to identify the hundreds of thousands of known proteins, any of which can be modified to include one or more unnatural amino acid, e.g., by tailoring any available mutation methods to include one or more appropriate selector codon in a relevant translation system. Common sequence repositories for known proteins include GenBank, EMBL, DDBJ and the NCBI. Other repositories can easily be identified by searching the internet.

Typically, the proteins are, e.g., at least 60%, at least 70%, at least 75%, at least 80%, at least 90%, at least 95%, or at least 99% or more identical to any available protein (e.g., a therapeutic protein, a diagnostic protein, an industrial enzyme, or portion thereof, and the like), and they comprise one or more unnatural amino acid. Examples of therapeutic, diagnostic, and other proteins that can be modified to comprise one or more redox active amino acid include, but are not limited to, those in International Application Number PCT/US2004/011786, filed Apr. 16, 2004, entitled “Expanding the Eukaryotic Genetic Code;” and, WO 2002/085923, entitled “IN VIVO INCORPORATION OF UNNATURAL AMINO ACIDS.” Examples of therapeutic, diagnostic, and other proteins that can be modified to comprise one or more redox active amino acids include, but are not limited to, e.g., Alpha-1 antitrypsin, Angiostatin, Antihemolytic factor, antibodies (further details on antibodies are found below), Apolipoprotein, Apoprotein, Atrial natriuretic factor, Atrial natriuretic polypeptide, Atrial peptides, C-X-C chemokines (e.g., T39765, NAP-2, ENA-78, Gro-a, Gro-b, Gro-c, IP-10, GCP-2, NAP-4, SDF-1, PF4, MIG), Calcitonin, CC chemokines (e.g., Monocyte chemoattractant protein-1, Monocyte chemoattractant protein-2, Monocyte chemoattractant protein-3, Monocyte inflammatory protein-1 alpha, Monocyte inflammatory protein-1 beta, RANTES, 1309, R83915, R91733, HCC1, T58847, D31065, T64262), CD40 ligand, C-kit Ligand, Collagen, Colony stimulating factor (CSF), Complement factor 5a, Complement inhibitor, Complement receptor 1, cytokines (e.g., epithelial Neutrophil Activating Peptide-78, GROα/MGSA, GROβ, GROγ, MIP-1α, MIP-1δ, MCP-1), Epidermal Growth Factor (EGF), Erythropoietin (“EPO”), Exfoliating toxins A and B, Factor IX, Factor VII, Factor VIII, Factor X, Fibroblast Growth Factor (FGF), Fibrinogen, Fibronectin, G-CSF, GM-CSF, Glucocerebrosidase, Gonadotropin, growth factors, Hedgehog proteins (e.g., Sonic, Indian, Desert), Hemoglobin, Hepatocyte Growth Factor (HGF), Hirudin, Human serum albumin, Insulin, Insulin-like Growth Factor (IGF), interferons (e.g., IFN-α, IFN-β, IFN-γ), interleukins (e.g., IL-1, IL-2, IL-3, IL-4, IL-5, IL-6, IL-7, IL-8, IL-9, IL-10, IL-11, IL-12, etc.), Keratinocyte Growth Factor (KGF), Lactoferrin, leukemia inhibitory factor, Luciferase, Neurturin, Neutrophil inhibitory factor (NIF), oncostatin M, Osteogenic protein, Parathyroid hormone, PD-ECSF, PDGF, peptide hormones (e.g., Human Growth Hormone), Pleiotropin, Protein A, Protein G, Pyrogenic exotoxins A, B, and C, Relaxin, Renin, SCF, Soluble complement receptor I, Soluble I-CAM 1, Soluble interleukin receptors (1, 2, 3, 4, 5, 6, 7, 9, 10, 11, 12, 13, 14, 15), Soluble TNF receptor, Somatomedin, Somatostatin, Somatotropin, Streptokinase, Superantigens, i.e., Staphylococcal enterotoxins (SEA, SEB, SEC1, SEC2, SEC3, SED, SEE), Superoxide dismutase (SOD), Toxic shock syndrome toxin (TSST-1), Thymosin alpha 1, Tissue plasminogen activator, Tumor necrosis factor beta (TNF beta), Tumor necrosis factor receptor (TNFR), Tumor necrosis factor-alpha (TNF alpha), Vascular Endothelial Growth Factor (VEGF), Urokinase and many others.
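The percent-identity thresholds recited above can be illustrated with a minimal Python sketch. It assumes the two sequences are already aligned and of equal length, which is a simplification; real comparisons would use an alignment algorithm. The function name and the 20-residue fragments are hypothetical and are not taken from this disclosure.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of aligned positions that carry the same residue.

    Assumes both sequences are pre-aligned and of equal length
    (a simplification made for this sketch).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


# Hypothetical fragments; "X" marks a position holding an unnatural amino acid.
reference = "MKTAYIAKQRQISFVKSHFS"
variant = "MKTXYIAKQRQISFVKSHFS"

identity = percent_identity(reference, variant)
meets_75_percent = identity >= 75.0
```

In this toy case a single substituted position out of twenty leaves the variant well above the 75% threshold.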

One class of proteins that can be made using the compositions and methods for in vivo incorporation of redox active amino acids described herein includes transcriptional modulators or a portion thereof. Example transcriptional modulators include genes and transcriptional modulator proteins that modulate cell growth, differentiation, regulation, or the like. Transcriptional modulators are found in prokaryotes, viruses, and eukaryotes, including fungi, plants, yeasts, insects, and animals, including mammals, providing a wide range of therapeutic targets. It will be appreciated that expression and transcriptional activators regulate transcription by many mechanisms, e.g., by binding to receptors, stimulating a signal transduction cascade, regulating expression of transcription factors, binding to promoters and enhancers, binding to proteins that bind to promoters and enhancers, unwinding DNA, splicing pre-mRNA, polyadenylating RNA, and degrading RNA.

One class of proteins of the invention (e.g., proteins with one or more redox active amino acids) includes expression activators such as cytokines, inflammatory molecules, growth factors, their receptors, and oncogene products, e.g., interleukins (e.g., IL-1, IL-2, IL-8, etc.), interferons, FGF, IGF-I, IGF-II, PDGF, TNF, TGF-α, TGF-β, EGF, KGF, SCF/c-Kit, CD40L/CD40, VLA-4/VCAM-1, ICAM-1/LFA-1, and hyalurin/CD44; signal transduction molecules and corresponding oncogene products, e.g., Mos, Ras, Raf, and Met; and transcriptional activators and suppressors, e.g., p53, Tat, Fos, Myc, Jun, Myb, Rel, and steroid hormone receptors such as those for estrogen, progesterone, testosterone, aldosterone, the LDL receptor ligand and corticosterone.

Enzymes (e.g., industrial enzymes) or portions thereof with at least one redox active amino acid are also provided by the invention. Examples of enzymes include, but are not limited to, e.g., amidases, amino acid racemases, acylases, dehalogenases, dioxygenases, diarylpropane peroxidases, epimerases, epoxide hydrolases, esterases, isomerases, kinases, glucose isomerases, glycosidases, glycosyl transferases, haloperoxidases, monooxygenases (e.g., p450s), lipases, lignin peroxidases, nitrile hydratases, nitrilases, proteases, phosphatases, subtilisins, transaminases, and nucleases.

Many of these proteins are commercially available (see, e.g., the Sigma BioSciences 2002 catalogue and price list), and the corresponding protein sequences and genes and, typically, many variants thereof, are well-known (see, e.g., Genbank). Any of them can be modified by the insertion of one or more redox active amino acid according to the invention, e.g., to alter the protein with respect to one or more therapeutic, diagnostic or enzymatic properties of interest. Examples of therapeutically relevant properties include serum half-life, shelf half-life, stability, immunogenicity, therapeutic activity, detectability (e.g., by the inclusion of reporter groups (e.g., labels or label binding sites) in the unnatural amino acids, e.g., redox active amino acids), reduction of LD₅₀ or other side effects, ability to enter the body through the gastric tract (e.g., oral availability), or the like. Examples of diagnostic properties include shelf half-life, stability, diagnostic activity, detectability, or the like. Examples of relevant enzymatic properties include shelf half-life, stability, enzymatic activity, production capability, or the like.

A variety of other proteins can also be modified to include one or more redox active amino acid of the invention. For example, the invention can include substituting one or more natural amino acids in one or more vaccine proteins with a redox active amino acid, e.g., in proteins from infectious fungi, e.g., Aspergillus, Candida species; bacteria, particularly E. coli, which serves as a model for pathogenic bacteria, as well as medically important bacteria such as Staphylococci (e.g., aureus), or Streptococci (e.g., pneumoniae); protozoa such as sporozoa (e.g., Plasmodia), rhizopods (e.g., Entamoeba) and flagellates (Trypanosoma, Leishmania, Trichomonas, Giardia, etc.); viruses such as (+) RNA viruses (examples include Poxviruses e.g., vaccinia; Picornaviruses, e.g. polio; Togaviruses, e.g., rubella; Flaviviruses, e.g., HCV; and Coronaviruses), (−) RNA viruses (e.g., Rhabdoviruses, e.g., VSV; Paramyxoviruses, e.g., RSV; Orthomyxoviruses, e.g., influenza; Bunyaviruses; and Arenaviruses), dsDNA viruses (Reoviruses, for example), RNA to DNA viruses, i.e., Retroviruses, e.g., HIV and HTLV, and certain DNA to RNA viruses such as Hepatitis B.

Agriculturally related proteins such as insect resistance proteins (e.g., the Cry proteins), starch and lipid production enzymes, plant and insect toxins, toxin-resistance proteins, Mycotoxin detoxification proteins, plant growth enzymes (e.g., Ribulose 1,5-Bisphosphate Carboxylase/Oxygenase, “RUBISCO”), lipoxygenase (LOX), and Phosphoenolpyruvate (PEP) carboxylase are also suitable targets for redox active amino acid modification.

In certain embodiments, the protein or polypeptide of interest (or portion thereof) in the methods and/or compositions of the invention is encoded by a nucleic acid. Typically, the nucleic acid comprises at least one selector codon, at least two selector codons, at least three selector codons, at least four selector codons, at least five selector codons, at least six selector codons, at least seven selector codons, at least eight selector codons, at least nine selector codons, or ten or more selector codons.

Genes coding for proteins or polypeptides of interest can be mutagenized using methods well-known to one of skill in the art and described herein under “Mutagenesis and Other Molecular Biology Techniques” to include, e.g., one or more selector codon for the incorporation of a redox active amino acid. For example, a nucleic acid for a protein of interest is mutagenized to include one or more selector codon, providing for the insertion of the one or more redox active amino acids. The invention includes any such variant, e.g., mutant, versions of any protein, e.g., including at least one redox active amino acid. Similarly, the invention also includes corresponding nucleic acids, i.e., any nucleic acid with one or more selector codon that encodes one or more redox active amino acid.
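As an illustration of introducing a selector codon into a coding sequence, the following Python sketch replaces the codon at a chosen residue number with the amber codon TAG and counts the in-frame selector codons that result. It is a string-level sketch only, not a mutagenesis protocol; the function names and the seven-codon open reading frame are hypothetical.

```python
def introduce_selector_codon(orf: str, residue_number: int, selector: str = "TAG") -> str:
    """Return a copy of the ORF with the codon at residue_number (1-based) replaced."""
    orf = orf.upper()
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of three")
    n_codons = len(orf) // 3
    if not 1 <= residue_number <= n_codons:
        raise ValueError("residue number outside the ORF")
    start = 3 * (residue_number - 1)
    return orf[:start] + selector + orf[start + 3:]


def count_selector_codons(orf: str, selector: str = "TAG") -> int:
    """Count in-frame occurrences of the selector codon."""
    codons = [orf[i:i + 3] for i in range(0, len(orf) - len(orf) % 3, 3)]
    return sum(codon == selector for codon in codons)


# Hypothetical ORF: ATG GTT CTG TCT GAA GGT TAA
wild_type_orf = "ATGGTTCTGTCTGAAGGTTAA"
amber_mutant = introduce_selector_codon(wild_type_orf, 4)
n_selectors = count_selector_codons(amber_mutant)
```

Only in-frame codons are counted, so a TAG that arises across a codon boundary would not be treated as a selector codon.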

To make a protein that includes a redox active amino acid, one can use host cells and organisms that are adapted for the in vivo incorporation of the redox active amino acid via orthogonal tRNA/RS pairs. Host cells are genetically engineered (e.g., transformed, transduced or transfected) with one or more vectors that express the orthogonal tRNA, the orthogonal tRNA synthetase, and a vector that encodes the protein to be derivatized. Each of these components can be on the same vector, or each can be on a separate vector, or two components can be on one vector and the third component on a second vector. The vector can be, for example, in the form of a plasmid, a bacterium, a virus, a naked polynucleotide, or a conjugated polynucleotide.

Defining Polypeptides by Immunoreactivity

Because the polypeptides of the invention provide a variety of new polypeptide sequences (e.g., comprising redox active amino acids in the case of proteins synthesized in the translation systems herein, or, e.g., in the case of the novel synthetases, novel sequences of standard amino acids), the polypeptides also provide new structural features which can be recognized, e.g., in immunological assays. The generation of antisera, which specifically bind the polypeptides of the invention, as well as the polypeptides which are bound by such antisera, are a feature of the invention. The term “antibody,” as used herein, includes, but is not limited to a polypeptide substantially encoded by an immunoglobulin gene or immunoglobulin genes, or fragments thereof which specifically bind and recognize an analyte (antigen). Examples include polyclonal, monoclonal, chimeric, and single chain antibodies, and the like. Fragments of immunoglobulins, including Fab fragments and fragments produced by an expression library, including phage display, are also included in the term “antibody” as used herein. See, e.g., Paul, Fundamental Immunology, 4th Ed., 1999, Raven Press, New York, for antibody structure and terminology.

In order to produce antisera for use in an immunoassay, one or more of the immunogenic polypeptides is produced and purified as described herein. For example, recombinant protein can be produced in a recombinant cell. An inbred strain of mice (used in this assay because results are more reproducible due to the virtual genetic identity of the mice) is immunized with the immunogenic protein(s) in combination with a standard adjuvant, such as Freund's adjuvant, and a standard mouse immunization protocol (see, e.g., Harlow and Lane (1988) Antibodies, A Laboratory Manual, Cold Spring Harbor Publications, New York, for a standard description of antibody generation, immunoassay formats and conditions that can be used to determine specific immunoreactivity). Additional details on proteins, antibodies, antisera, etc. can be found in U.S. Ser. Nos. 60/479,931, 60/463,869, and 60/496,548 entitled “Expanding the Eukaryotic Genetic Code;” WO 2002/085923, entitled “IN VIVO INCORPORATION OF UNNATURAL AMINO ACIDS;” patent application entitled “Glycoprotein synthesis” filed Jan. 16, 2003, U.S. Ser. No. 60/441,450; and patent application entitled “Protein Arrays,” attorney docket number P1001US00 filed on Dec. 22, 2002.

Use of O-tRNA and O-RS and O-tRNA/O-RS Pairs

The compositions of the invention and compositions made by the methods of the invention optionally are in a cell. The O-tRNA/O-RS pairs or individual components of the invention can then be used in a host system's translation machinery, which results in a redox active amino acid being incorporated into a protein. The corresponding patent application “In vivo Incorporation of Unnatural Amino Acids”, WO 2002/085923 by Schultz, et al. describes this process and is incorporated herein by reference. For example, when an O-tRNA/O-RS pair is introduced into a host, e.g., Escherichia coli, the pair leads to the in vivo incorporation of a redox active amino acid, which can be exogenously added to the growth medium, into a protein, e.g., myoglobin or a therapeutic protein, in response to a selector codon, e.g., an amber nonsense codon. Optionally, the compositions of the invention can be in an in vitro translation system, or in an in vivo system(s). Proteins with the redox active amino acid can be used as therapeutic proteins and can be used to alter catalytic function of enzymes and/or electron transfer pathways in proteins, to crosslink protein with small molecules and/or biomolecules, and to facilitate studies on protein structure, interactions with other proteins, electron transfer processes in proteins and the like.
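The effect of an O-tRNA/O-RS pair on the reading of an amber selector codon can be sketched in Python as a toy translation routine. This is a conceptual model only: it uses the standard genetic code, treats suppression as all-or-nothing, and ignores suppression efficiency and release-factor competition. The function name, the `[DHP]` placeholder, and the short open reading frame are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order for each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(orf: str, orthogonal_pair_present: bool,
              unnatural: str = "[DHP]", selector: str = "TAG") -> str:
    """Translate an ORF, reading the selector codon as the unnatural amino acid
    only when the orthogonal pair is present; otherwise it terminates translation."""
    protein = []
    for i in range(0, len(orf) - 2, 3):
        codon = orf[i:i + 3]
        if codon == selector and orthogonal_pair_present:
            protein.append(unnatural)
            continue
        amino_acid = CODON_TABLE[codon]
        if amino_acid == "*":  # stop codon (including an unsuppressed TAG)
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical ORF with an amber codon at the fourth position.
orf = "ATGGTTCTGTAGGAAGGTTAA"
with_pair = translate(orf, orthogonal_pair_present=True)
without_pair = translate(orf, orthogonal_pair_present=False)
```

With the pair present the sketch yields a full-length product containing the placeholder residue; without it, translation stops at the amber codon and only the truncated product is returned.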

Kits

Kits are also a feature of the invention. For example, a kit for producing a protein that comprises at least one redox active amino acid in a cell is provided, where the kit includes a container containing a polynucleotide sequence encoding an O-tRNA, and/or an O-tRNA, and/or a polynucleotide sequence encoding an O-RS, and/or an O-RS. In one embodiment, the kit further includes a redox active amino acid. In another embodiment, the kit further comprises instructional materials for producing the protein.

EXAMPLES

The following examples are offered to illustrate, but not to limit the claimed invention. One of skill will recognize a variety of non-critical parameters that may be altered without departing from the scope of the claimed invention.

Example 1 Site-Specific Incorporation of a Redox Active Amino Acid into Proteins

Recently it has been reported that a number of unnatural amino acids can be incorporated selectively into proteins in E. coli and yeast (Wang et al. (2001) Science 292:498-500; Zhang et al. (2003) Biochemistry 42:6735-6746; Chin et al. (2003) Science 301:964-967) using orthogonal tRNA-aminoacyl tRNA synthetase pairs. These orthogonal pairs do not cross-react with endogenous components of the translational machinery of the host cell, but recognize the desired unnatural amino acid and incorporate it into proteins in response to the amber nonsense codon, TAG (Wang et al. (2000) J. Am. Chem. Soc., 122:5010-5011; Wang and Schultz (2001) Chem. Biol., 8:883-890). To genetically encode 3,4-dihydroxy-L-phenylalanine (DHP; see compound 1 in FIG. 1) in E. coli, the specificity of an orthogonal Methanococcus jannaschii tRNA-synthetase (MjTyrRS; provided in FIG. 5 and Table 1, and also, amino acid sequence provided in SEQ ID NO:4 and nucleotide sequence provided in SEQ ID NO: 5) was altered so that the synthetase aminoacylates the mutant tyrosine tRNA amber suppressor (mutRNA_(CUA) ^(Tyr)) with DHP and not with any of the common twenty amino acids. These mutant synthetases were selected from two mutant MjTyrRS libraries (Wang et al. (2001) Science 292:498-500; Zhang et al. (2002) Angew. Chem. Int. Ed., 41:2840-2842). In the first library, which is based on an analysis of the crystal structure of the homologous TyrRS from Bacillus stearothermophilus (Brick et al. (1989) J. Mol. Biol., 208:83-98), five residues (Tyr 32, Glu 107, Asp 158, Ile 159, and Leu 162) in the active site of MjTyrRS that are within 6.5 Å of the para position of the aryl ring of tyrosine were randomly mutated (encoded on plasmid pBK-lib). In the second library, six residues (Tyr 32, Ala 67, His 70, Gln 155, Asp 158, Ala 167) within 6.9 Å of the meta position of the tyrosine aryl ring were randomly mutated (encoded on plasmid pBK-lib-m).
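The theoretical protein-level diversity of the two libraries can be estimated with a short Python calculation. The sketch assumes that every randomized position can take any of the twenty common amino acids, which the text does not state explicitly; it says nothing about the number of clones actually obtained or screened.

```python
NUM_COMMON_AMINO_ACIDS = 20  # assumption: full randomization at each site

first_library_sites = ["Tyr32", "Glu107", "Asp158", "Ile159", "Leu162"]
second_library_sites = ["Tyr32", "Ala67", "His70", "Gln155", "Asp158", "Ala167"]


def theoretical_protein_diversity(sites: list) -> int:
    """Number of distinct protein variants if each site is fully randomized."""
    return NUM_COMMON_AMINO_ACIDS ** len(sites)


first_library_diversity = theoretical_protein_diversity(first_library_sites)
second_library_diversity = theoretical_protein_diversity(second_library_sites)
```

Under this assumption the five-site library spans 3.2 million possible protein variants and the six-site library 64 million, which is why a genetic selection rather than clone-by-clone screening is used.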

To alter the specificity of the TyrRS so it specifically incorporates DHP and none of the other natural amino acids, a genetic selection which consists of several rounds of positive and negative selection was applied. In the positive selection, both libraries of mutant TyrRS were subjected to a selection scheme based on the suppression of an amber codon introduced at a nonessential position (Asp112) in the chloramphenicol acetyl transferase (CAT) gene (pRep(2)/YC). Cells transformed with the mutant TyrRS libraries, the mutRNA_(CUA) ^(Tyr) gene, and the amber mutant CAT gene were grown in minimal media containing 1 mM DHP and 70 μg/ml chloramphenicol under anaerobic conditions to avoid the oxidation of DHP. Surviving cells contain mutant TyrRSs that aminoacylate the mutRNA_(CUA) ^(Tyr) with either DHP or endogenous amino acids. Next, a negative selection was applied to remove the mutant TyrRSs that charge natural amino acids based on suppression of three amber codons introduced at nonessential positions (Gln2, Asp44, Gly65) in the toxic barnase gene (pLWJ17B3). Cells harboring the mutant TyrRSs from the previous positive selection, the mutRNA_(CUA) ^(Tyr), and the amber mutant barnase gene were grown in Luria-Bertani (LB) media in the absence of DHP. Under these conditions, cells encoding mutant TyrRSs with specificity for endogenous amino acids will produce full-length barnase and die. Only those cells containing mutant TyrRSs with specificity for DHP can survive. After three rounds of positive selection alternating with two rounds of negative selection, a clone was evolved whose survival at high concentrations of chloramphenicol (90 mg/L) was dependent on the presence of DHP, the selected mutant TyrRS gene (DHPRS), mutRNA_(CUA) ^(Tyr), and the Asp112TAG CAT gene. However, in the absence of DHP, the same cells survived only in 20 mg/L chloramphenicol. This result suggests that the selected DHPRS enzyme has higher specificity for DHP than for natural amino acids. Sequencing revealed the following mutations in the selected DHPRS: Tyr32→Leu, Ala67→Ser, His70→Asn, Ala167→Gln. The DHPRS synthetase is shown in FIG. 6 and Table 1. Also, the amino acid sequence is provided in SEQ ID NO:1 and the nucleotide sequence is provided in SEQ ID NO:3.
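The four substitutions can be checked against the sequences in Table 1 with a short Python sketch. It applies the listed changes to the first 170 residues of wild-type MjTyrRS (SEQ ID NO: 4) and compares the result with the corresponding residues of DHPRS (SEQ ID NO: 1). Only this N-terminal portion is used, to keep the example short; the function and variable names are hypothetical.

```python
# First 170 residues of wild-type MjTyrRS, copied from SEQ ID NO: 4 in Table 1.
MJ_TYRRS_PREFIX = (
    "MDEFEMIKRNTSEIISEEELREVLKKDEKSAYIG"
    "FEPSGKIHLGHYLQIKKMIDLQNAGFDIIILLAD"
    "LHAYLNQKGELDEIRKIGDYNKKVFEAMGLKAKY"
    "VYGSEFQLDKDYTLNVYRLALKTTLKRARRSMEL"
    "IAREDENPKVAEVIYPIMQVNDIHYLGVDVAVGG"
)

# First 170 residues of DHPRS, copied from SEQ ID NO: 1 in Table 1.
DHPRS_PREFIX = (
    "MDEFEMIKRNTSEIISEEELREVLKKDEKSALIG"
    "FEPSGKIHLGHYLQIKKMIDLQNAGFDIIILLSD"
    "LNAYLNQKGELDEIRKIGDYNKKVFEAMGLKAKY"
    "VYGSEFQLDKDYTLNVYRLALKTTLKRARRSMEL"
    "IAREDENPKVAEVIYPIMQVNDIHYLGVDVQVGG"
)

# (wild-type residue, 1-based position, replacement) for
# Tyr32->Leu, Ala67->Ser, His70->Asn, Ala167->Gln.
SUBSTITUTIONS = [("Y", 32, "L"), ("A", 67, "S"), ("H", 70, "N"), ("A", 167, "Q")]


def apply_substitutions(sequence: str, substitutions) -> str:
    """Apply point substitutions, checking the expected wild-type residue first."""
    residues = list(sequence)
    for wild_type, position, replacement in substitutions:
        index = position - 1
        if residues[index] != wild_type:
            raise ValueError(f"expected {wild_type} at position {position}")
        residues[index] = replacement
    return "".join(residues)


mutant_prefix = apply_substitutions(MJ_TYRRS_PREFIX, SUBSTITUTIONS)
changed_positions = [
    i + 1
    for i, (a, b) in enumerate(zip(MJ_TYRRS_PREFIX, DHPRS_PREFIX))
    if a != b
]
```

Within this region the two listed sequences differ only at positions 32, 67, 70 and 167, matching the substitutions reported from sequencing.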

To measure the fidelity and efficiency of DHP incorporation, using the selected clone pDHPRS, we incorporated DHP in response to an amber codon at the surface exposed fourth residue in C-terminally hexahistidine tagged mutant sperm whale myoglobin (Mb; see Chin et al. (2002) Proc. Natl. Acad. Sci. U.S.A., 99:11020-11024). Full-length myoglobin containing DHP (DHPMb) was expressed using GMML (glycerol minimal media with leucine) as the growth medium and under reducing conditions (100 μM dithiothreitol (DTT)), in order to prevent oxidation of DHP prior to incorporation into the protein. The yield of mutant protein was approximately 1 mg/liter (the yield of wild type Mb (wtMb) under the same conditions is undetectable). No full-length Mb was expressed in the absence of DHP; in the absence of DTT most cells died due to the toxicity of the oxidized quinone (see compound 3 in FIG. 1). Full-length DHPMb was purified using cobalt-based IMAC resin (immobilized metal affinity chromatography). The purified samples of the mutant proteins expressed in the presence and in the absence of DHP were loaded on an SDS-PAGE gel for silver staining and western blotting. Western blotting with anti-His6-tag antibody showed that no full-length Mb was expressed in the absence of either DHPRS or mutRNA_(CUA) ^(Tyr) (shown in FIG. 2A). Electrospray-ionization (ESI) with a quadrupole-quadrupole time-of-flight (QqTOF) mass spectrometer was used to measure the molecular weight of the protein. FIG. 2B shows the ESI-QqTOF mass spectrum of DHPMb with a mass of 18,448.5 Dalton. This is within 70 p.p.m. of the calculated mass of 18,447.2 Dalton for the DHP containing Mb (a neighboring peak shows a mass of 18,432.3 Dalton due to a loss of oxygen, or oxygen and proton, caused, according to control experiments, by the measuring technique).
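The mass accuracy quoted above follows from a simple relative-error calculation, sketched here in Python using only the masses stated in the text. The function name is hypothetical.

```python
def mass_error_ppm(observed_da: float, calculated_da: float) -> float:
    """Relative mass error in parts per million."""
    return (observed_da - calculated_da) / calculated_da * 1e6


observed_mass = 18448.5      # ESI-QqTOF mass of DHPMb, in Dalton
calculated_mass = 18447.2    # calculated mass of DHP-containing Mb, in Dalton
neighbor_peak_mass = 18432.3  # neighboring peak, in Dalton

error_ppm = mass_error_ppm(observed_mass, calculated_mass)  # roughly 70 ppm
neighbor_offset = observed_mass - neighbor_peak_mass        # roughly 16 Da
```

The offset of about 16 Da between the main peak and its neighbor is close to the mass of one oxygen atom, consistent with the attribution of that peak to a loss of oxygen.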

Cyclic voltammetry was used to determine whether the redox wave of the oxidized hydroquinone could be observed when a bare gold electrode was immersed in a solution containing the DHPMb. FIG. 3A shows an irreversible voltammetric response of a solution containing the wtMb and that of the DHPMb under anaerobic conditions (Bard and Faulkner, In Electrochemical Methods; John Wiley & Sons, Inc.: New York, 1980; pp 213-248, 429-487 and 675-698). The reductive peak potential originating from the wtMb Fe(III) is observed at E=−320 mV, whereas the reductive peak potential of the mutant protein is shifted to a more negative potential, E=−400 mV. This shift is attributed to the presence of DHP, which may facilitate the reduction of Fe(III) at a much lower potential than in the absence of DHP. The observed irreversible voltammograms are due to a slow electron transfer rate, which is likely to be derived from the limited accessibility of the electron to the electrode. FIG. 3B shows the voltammetric response of a solution containing 100 μM of DHP, wtMb and the DHPMb. The current originating from DHP oxidation appears only in the presence of the mutated Mb or in a solution of free DHP, with E=580 mV and E=385 mV, respectively. These results show clearly that there is a significant influence of the presence of DHP in Mb on the redox potential of the Fe(III)-heme group and vice versa.
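The potential shifts described above can be tabulated with a few lines of Python. The sketch takes the reductive peak of the mutant protein as −400 mV; the negative sign is an assumption made here because the text describes a shift to a more negative potential than −320 mV. Variable names are hypothetical.

```python
# Reductive peak potentials of the heme Fe(III), in mV.
wt_mb_reduction_mv = -320.0
dhp_mb_reduction_mv = -400.0  # assumption: negative sign, per "more negative potential"

# Oxidation potentials of DHP, in mV.
free_dhp_oxidation_mv = 385.0
dhp_in_mb_oxidation_mv = 580.0

# Shift of the heme reduction caused by the presence of DHP in the protein.
heme_reduction_shift_mv = dhp_mb_reduction_mv - wt_mb_reduction_mv

# Shift of DHP oxidation caused by incorporation into the protein.
dhp_oxidation_shift_mv = dhp_in_mb_oxidation_mv - free_dhp_oxidation_mv
```

Under this reading the heme reduction moves by −80 mV and the DHP oxidation by +195 mV, which is the mutual influence between DHP and the Fe(III)-heme group that the text describes.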

The description provided herein demonstrates that redox active amino acids, e.g., DHP, can be efficiently and selectively incorporated into proteins in an organism, e.g., E. coli. These amino acids can be oxidized electrochemically within the protein. The ability to incorporate redox active amino acids site specifically into proteins can facilitate the study of electron transfer in proteins, as well as enable the engineering of redox proteins with novel properties. The site-specific incorporation of redox active amino acids, e.g., DHP, into various sites in model proteins, e.g., Mb and other proteins, can be used to study electron transfer pathways in this protein and others (Mayo et al. (1986) Science 233:948-952; Gray and Malmstrom (1989) Biochemistry 28:7499-7505).

Example 2 Exemplary O-RSs and O-tRNAs for the Incorporation of Redox Active Amino Acids

An exemplary O-tRNA comprises SEQ ID NO.: 2 (see Table 1). Example O-RSs include the amino acid sequence provided in SEQ ID NO.: 1 (see Table 1) and FIG. 6. Examples of polynucleotides that encode O-RSs or portions thereof include polynucleotides that encode an amino acid sequence comprising SEQ ID NO.: 1. For example, the polynucleotide provided in FIG. 6 and SEQ ID NO:3 encodes an exemplary O-RS.
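The relationship between the CUA anticodon of the exemplary O-tRNA and the amber codon can be illustrated with a brief Python sketch. The tRNA sequence is copied from SEQ ID NO: 2 in Table 1; the helper function and variable names are hypothetical, and the sketch makes no claim about tRNA secondary structure.

```python
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement_rna(sequence: str) -> str:
    """Return the reverse complement of an RNA sequence."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(sequence))


# mutRNA(Tyr, CUA), copied from SEQ ID NO: 2 in Table 1.
MUT_TRNA = (
    "CCGGCGGUAGUUCAGCAGGGCAGAACGGCGGACUCUAAAUCCGCAUGGCGCUGGUUCAAAUCCGGCCC"
    "GCCGGACCA"
)

anticodon = "CUA"
recognized_codon = reverse_complement_rna(anticodon)  # codon read 5' to 3' on the mRNA
anticodon_index = MUT_TRNA.find(anticodon)            # 0-based index in the sequence
```

The reverse complement of CUA is UAG, the mRNA form of the amber codon TAG used as the selector codon in Example 1.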

It is understood that the examples and embodiments described herein are for illustrative purposes only and that various modifications or changes in light thereof will be suggested to persons skilled in the art and are to be included within the spirit and purview of this application and scope of the appended claims.

While the foregoing invention has been described in some detail for purposes of clarity and understanding, it will be clear to one skilled in the art from a reading of this disclosure that various changes in form and detail can be made without departing from the true scope of the invention. For example, all the techniques and apparatus described above can be used in various combinations. All publications, patents, patent applications, and/or other documents cited in this application are incorporated by reference in their entirety for all purposes to the same extent as if each individual publication, patent, patent application, and/or other document were individually indicated to be incorporated by reference for all purposes.

TABLE 1 SEQUENCES

SEQ ID NO: 1
Description: DHPRS (synthetase) amino acid sequence, having amino acid changes: Tyr32→Leu, Ala67→Ser, His70→Asn, Ala167→Gln based on Methanococcus jannaschii tyrosine tRNA-synthetase (MjTyrRS)
Sequence:
MDEFEMIKRNTSEIISEEELREVLKKDEKSALIGFEPSGKIHLGHYLQIKKMIDLQNAGFDIIILLSD
LNAYLNQKGELDEIRKIGDYNKKVFEAMGLKAKYVYGSEFQLDKDYTLNVYRLALKTTLKRARRSMEL
IAREDENPKVAEVIYPIMQVNDIHYLGVDVQVGGMEQRKIHMLARELLPKKVVCIHNPVLTGLDGEGK
MSSSKGNFIAVDDSPEEIRAKIKKAYCPAGVVEGNPIMEIAKYFLEYPLTIKRPEKFGGDLTVNSYEE
LESLFKNKELHPMDLKNAVAEELIKILEPIRKRL

SEQ ID NO: 2
Description: mutRNA^(Tyr) _(CUA)
Sequence:
CCGGCGGUAGUUCAGCAGGGCAGAACGGCGGACUCUAAAUCCGCAUGGCGCUGGUUCAAAUCCGGCCC
GCCGGACCA

SEQ ID NO: 3
Description: DHPRS (synthetase) nucleotide sequence, encoding amino acid changes: Tyr32→Leu, Ala67→Ser, His70→Asn, Ala167→Gln based on Methanococcus jannaschii tyrosine tRNA-synthetase (MjTyrRS)
Sequence:
ATGGACGAATTTGAAATGATAAAGAGAAACACATCTGAAATTATCAGCGAGGAAGAGTTAAGAGAGGT
TTTAAAAAAAGATGAAAAATCTGCTCTCATAGGTTTTGAACCAAGTGGTAAAATACATTTAGGGCATT
ATCTCCAAATAAAAAAGATGATTGATTTACAAAATGCTGGATTTGATATAATTATATTGTTGAGCGAT
TTAAACGCCTATTTAAACCAGAAAGGAGAGTTGGATGAGATTAGAAAAATAGGAGATTATAACAAAAA
AGTTTTTGAAGCAATGGGGTTAAAGGCAAAATATGTTTATGGAAGTGAATTCCAGCTTGATAAGGATT
ATACACTGAATGTCTATAGATTGGCTTTAAAAACTACCTTAAAAAGAGCAAGAAGGAGTATGGAACTT
ATAGCAAGAGAGGATGAAAATCCAAAGGTTGCTGAAGTTATCTATCCAATAATGCAGGTTAATGATAT
TCATTATTTAGGCGTTGATGTTCAGGTTGGAGGGATGGAGCAGAGAAAAATACACATGTTAGCAAGGG
AGCTTTTACCAAAAAAGGTTGTTTGTATTCACAACCCTGTCTTAACGGGTTTGGATGGAGAAGGAAAG
ATGAGTTCTTCAAAAGGGAATTTTATAGCTGTTGATGACTCTCCAGAAGAGATTAGGGCTAAGATAAA
GAAAGCATACTGCCCAGCTGGAGTTGTTGAAGGAAATCCAATAATGGAGATAGCTAAATACTTCCTTG
AATATCCTTTAACCATAAAAAGGCCAGAAAAATTTGGTGGAGATTTGACAGTTAATAGCTATGAGGAG
TTAGAGAGTTTATTTAAAAATAAGGAATTGCATCCAATGGATTTAAAAAATGCTGTAGCTGAAGAACT
TATAAAGATTTTAGAGCCAATTAGAAAGAGATTA

SEQ ID NO: 4
Description: Methanococcus jannaschii tyrosine tRNA-synthetase (MjTyrRS) amino acid sequence
Sequence:
MDEFEMIKRNTSEIISEEELREVLKKDEKSAYIGFEPSGKIHLGHYLQIKKMIDLQNAGFDIIILLAD
LHAYLNQKGELDEIRKIGDYNKKVFEAMGLKAKYVYGSEFQLDKDYTLNVYRLALKTTLKRARRSMEL
IAREDENPKVAEVIYPIMQVNDIHYLGVDVAVGGMEQRKIHMLARELLPKKVVCIHNPVLTGLDGEGK
MSSSKGNFIAVDDSPEEIRAKIKKAYCPAGVVEGNPIMEIAKYFLEYPLTIKRPEKFGGDLTVNSYEE
LESLFKNKELHPMDLKNAVAEELIKILEPIRKRL

SEQ ID NO: 5
Description: Methanococcus jannaschii tyrosine tRNA-synthetase (MjTyrRS) nucleotide sequence
Sequence:
ATGGACGAATTTGAAATGATAAAGAGAAACACATCTGAAATTATCAGCGAGGAAGAGTTAAGAGAGGT
TTTAAAAAAAGATGAAAAATCTGCTTACATAGGTTTTGAACCAAGTGGTAAAATACATTTAGGGCATT
ATCTCCAAATAAAAAAGATGATTGATTTACAAAATGCTGGATTTGATATAATTATATTGTTGGCTGAT
TTACACGCCTATTTAAACCAGAAAGGAGAGTTGGATGAGATTAGAAAAATAGGAGATTATAACAAAAA
AGTTTTTGAAGCAATGGGGTTAAAGGCAAAATATGTTTATGGAAGTGAATTCCAGCTTGATAAGGATT
ATACACTGAATGTCTATAGATTGGCTTTAAAAACTACCTTAAAAAGAGCAAGAAGGAGTATGGAACTT
ATAGCAAGAGAGGATGAAAATCCAAAGGTTGCTGAAGTTATCTATCCAATAATGCAGGTTAATGATAT
TCATTATTTAGGCGTTGATGTTGCAGTTGGAGGGATGGAGCAGAGAAAAATACACATGTTAGCAAGGG
AGCTTTTACCAAAAAAGGTTGTTTGTATTCACAACCCTGTCTTAACGGGTTTGGATGGAGAAGGAAAG
ATGAGTTCTTCAAAAGGGAATTTTATAGCTGTTGATGACTCTCCAGAAGAGATTAGGGCTAAGATAAA
GAAAGCATACTGCCCAGCTGGAGTTGTTGAAGGAAATCCAATAATGGAGATAGCTAAATACTTCCTTG
AATATCCTTTAACCATAAAAAGGCCAGAAAAATTTGGTGGAGATTTGACAGTTAATAGCTATGAGGAG
TTAGAGAGTTTATTTAAAAATAAGGAATTGCATCCAATGGATTTAAAAAATGCTGTAGCTGAAGAACT
TATAAAGATTTTAGAGCCAATTAGAAAGAGATTA

1. A composition comprising an orthogonal aminoacyl-tRNA synthetase (O-RS), wherein the O-RS preferentially aminoacylates an O-tRNA with a redox active amino acid.
2. The composition of claim 1, wherein the O-RS comprises an amino acid sequence comprising SEQ ID NO.: 1, or a conservative variation thereof.
3. The composition of claim 1, wherein the O-RS preferentially aminoacylates the O-tRNA with an efficiency of at least 50% of the efficiency of a polypeptide comprising an amino acid sequence of SEQ ID NO.: 1.
4. The composition of claim 1, wherein the O-RS is derived from a Methanococcus jannaschii.
5. The composition of claim 1, comprising a cell.
6. The composition of claim 5, wherein the cell is an E. coli cell.
7. The composition of claim 1, comprising a translation system.
8. A cell comprising a translation system, wherein the translation system comprises: an orthogonal-tRNA (O-tRNA); an orthogonal aminoacyl-tRNA synthetase (O-RS); and, a redox active amino acid; wherein the O-tRNA recognizes a first selector codon, and the O-RS preferentially aminoacylates the O-tRNA with the first redox active amino acid.
9. The cell of claim 8, wherein the O-RS preferentially aminoacylates the O-tRNA with an efficiency of at least 50% of the efficiency of a polypeptide comprising an amino acid sequence of SEQ ID NO.: 1.
10. The cell of claim 8, wherein the O-tRNA comprises or is encoded by a polynucleotide sequence as set forth in SEQ ID NO.: 2, or a complementary polynucleotide sequence thereof, and wherein the O-RS comprises an amino acid sequence comprising SEQ ID NO.: 1, or a conservative variation thereof.
11. The cell of claim 8, wherein the cell further comprises an additional different O-tRNA/O-RS pair and unnatural amino acid, wherein the O-tRNA recognizes a second selector codon and the O-RS preferentially aminoacylates the O-tRNA with the second unnatural amino acid.
12. The cell of claim 8, wherein the cell is a non-eukaryotic cell.
13. The cell of claim 12, wherein the non-eukaryotic cell is an E. coli cell.
14. The cell of claim 8, further comprising a nucleic acid that comprises a polynucleotide that encodes a polypeptide of interest, wherein the polynucleotide comprises a selector codon that is recognized by the O-tRNA.
15. An E. coli cell, comprising: an orthogonal tRNA (O-tRNA); an orthogonal aminoacyl-tRNA synthetase (O-RS), wherein the O-RS preferentially aminoacylates the O-tRNA with a redox active amino acid; the redox active amino acid; and, a nucleic acid that encodes a polypeptide of interest, wherein the nucleic acid comprises the selector codon that is recognized by the O-tRNA.
16. The E. coli cell of claim 15, wherein the O-RS preferentially aminoacylates the O-tRNA with an efficiency of at least 50% of the efficiency of a polypeptide comprising an amino acid sequence of SEQ ID NO.: 1.
17. An artificial polypeptide comprising SEQ ID NO.: 1.
18. An artificial polynucleotide that encodes a polypeptide of claim 17.
19. A vector comprising or encoding a polynucleotide of claim 18.
20. The vector of claim 19, wherein the vector comprises a plasmid, a cosmid, a phage, or a virus.
21. The vector of claim 19, wherein the vector is an expression vector.
22. A cell comprising the vector of claim 19.
23. A method for identifying an orthogonal-aminoacyl-tRNA synthetase for use with an O-tRNA that utilizes a redox amino acid, the method comprising: subjecting to selection a population of cells of a first species, wherein the cells each comprise: 1) a member of a plurality of aminoacyl-tRNA synthetases (RSs); 2) the orthogonal tRNA (O-tRNA) derived from one or more species; and, 3) a polynucleotide that encodes a selection marker and comprises at least one selector codon; wherein cells that are enhanced in suppression efficiency as compared to cells lacking or comprising a reduced amount of the member of the plurality of RSs comprise an active RS that aminoacylates the O-tRNA; and, selecting the active RS that aminoacylates the O-tRNA with the redox active amino acid, thereby providing the orthogonal-aminoacyl-tRNA synthetase for use with the O-tRNA.
24. The method of claim 23, wherein the selection comprises a positive selection and the selection marker comprises a positive selection marker.
25. The method of claim 23, wherein the plurality of RSs comprise mutant RSs, RSs derived from one or more species other than the first species or both mutant RSs and RSs derived from a species other than the first species.
26. An orthogonal aminoacyl-tRNA synthetase identified by the method of claim 23.
27. A method of producing a protein in a cell with a redox active amino acid at a specified position, the method comprising: growing, in an appropriate medium, the cell, where the cell comprises a nucleic acid that comprises at least one selector codon and encodes a protein; and, providing the redox active amino acid; wherein the cell further comprises: an orthogonal-tRNA (O-tRNA) that functions in the cell and recognizes the selector codon; and, an orthogonal aminoacyl-tRNA synthetase (O-RS) that preferentially aminoacylates the O-tRNA with the redox active amino acid; and, incorporating the redox active amino acid into the specified position in the protein during translation of the nucleic acid with the at least one selector codon, thereby producing the protein.
28. The method of claim 27, wherein the O-RS comprises an amino acid sequence which comprises SEQ ID NO.: 1.
29. The method of claim 27, wherein the cell is a non-eukaryotic cell.
30. The method of claim 29, wherein the non-eukaryotic cell is an E. coli cell.
31. A composition comprising a protein, wherein the protein comprises a redox active amino acid.
32. The composition of claim 31, wherein the redox active amino acid is selected from the group consisting of: a 3,4-dihydroxy-L-phenylalanine (DHP), a 3,4,5-trihydroxy-L-phenylalanine, a 3-nitro-tyrosine, a 4-nitro-phenylalanine, and a 3-thiol-tyrosine.
33. The composition of claim 31, wherein the protein comprises an amino acid sequence that is at least 75% identical to that of a wild-type therapeutic protein, a diagnostic protein, an industrial enzyme, or portion thereof.
34. The composition of claim 31, wherein the composition comprises a pharmaceutically acceptable carrier.